Introduction

Rhizobia are members of the family Rhizobiaceae, classically recognized as symbiotic bacteria of leguminous plants that characteristically fix atmospheric nitrogen (Hellriegel and Wilfarth 1888). The group comprises a large number of genera that nodulate more than 750 genera of legumes (Wojciechowski et al. 2004). The taxonomic status of rhizobia remains dynamic, as new rhizobial species are identified on a regular basis and known genera have been re-assigned or re-classified (Hernandez-Lucas et al. 1995). The rhizobial species identified so far are very diverse and form phylogenetically distinct groups. The widespread phylogenetic diversity of nitrogen-fixing legume symbionts and their taxonomy has been reported previously (Rivas et al. 2009). Until the 1980s, all symbiotic rhizobia isolated from leguminous plants were classified as belonging to the genus Rhizobium (Zakhia and de Lajudie 2001); in 1984, however, the taxonomy changed, and it continues to evolve today. Rhizobial taxonomic studies have currently led to a total of 21 genera (Chen et al. 2021; Kuzmanovic et al. 2022), and progress in taxonomy is due to the increasing number of effective techniques available for the characterization of bacteria (Ormeno-Orrillo et al. 2015; Lassalle et al. 2021). While the 16S rRNA gene sequence is considered the benchmark for the description of rhizobial species (Graham et al. 1991), other technological developments in genetic analysis, including DNA fingerprinting techniques, polymerase chain reaction (PCR) analysis using a large number of genes, and restriction fragment length polymorphism (RFLP), have contributed to defining and differentiating the closest strains of rhizobia (Ramirez-Bahena et al. 2008). Recently, next generation sequencing (NGS) techniques have been assimilated into rhizobial taxonomy through strategies such as comparative genomics (Ormeno-Orrillo et al. 2015), average nucleotide identity (ANI) of genome comparisons (Rashid et al. 2015), core genome phylogeny, core-proteome average amino acid identity (cpAAI) (Kuzmanovic et al. 2022), and high throughput sequencing of 16S rRNA for bacterial diversity (Zheng et al. 2020). In fact, advances in molecular biology techniques have facilitated considerable changes and the proposal of new rhizobial species.

Recently, post-genomics technologies have encouraged the creation of several algorithms that introduce new genome-based definitions for the taxonomy of prokaryotes. These algorithms have been widely accepted and provide valuable insights into microbial speciation and genomic diversity (Zong 2020). NGS technologies have led to the discovery of microbial phylogenetic novelty and enable researchers to (re-)classify and (re-)name organisms and to explore diverse natural microbial communities and their uncultivated taxa (Sanford et al. 2021). The theory of prokaryotic genome evolution has progressed through comparative genome analysis covering a wide range of evolutionary distances, and this could change the concepts of prokaryotic taxonomy (Koonin et al. 2021), including that of rhizobia. Phylogenomics provides precise strategies for delineating species and permits phylogeny to be inferred at higher taxonomic ranks as well as at the subspecies level.

Several reviews have discussed the taxonomy of legume-nodulating bacteria (Berrada and Fikri-Benbrahim 2014; Shamseldin et al. 2017). However, as a number of new genera have since been reported or re-classified, an updated description is required. In this review, we summarize the ongoing developments in the identification of rhizobia using recent techniques, including genomics-based strategies. New approaches have led to the reclassification of several genera, resulting in considerable changes in the taxonomy and nomenclature of rhizobia, and post-genomics technologies are significantly changing the current scientific classification of rhizobia. Therefore, this review describes the developments in rhizobial taxonomy in light of technological advances and progress in molecular perspectives, and also presents the currently recognized classification of the different genera of rhizobia.

Historical antecedents

The historical perspectives in rhizobial taxonomy can be categorized under two sections: (i) initial classification based on culture attributes, and (ii) numerical taxonomy based on phenotypic characteristics, as summarized below.

Culture attributes

Young and Haukka (1996) described the isolation and culturing of root-nodule bacteria by Beijerinck (1888, as cited in Young and Haukka 1996); the organism was named Bacillus radicicola and was later renamed Rhizobium leguminosarum by Frank (1889), as the type species of the genus Rhizobium (Willems 2006). The original genus Rhizobium underwent several changes that gave rise to numerous taxa. Until the 1980s, all symbiotic nitrogen-fixing bacteria were identified as Rhizobium and classified into six species (R. leguminosarum, R. trifolii, R. meliloti, R. phaseoli, R. japonicum and R. lupini) (Fred et al. 1932; Jordan and Allen 1974). Jordan (1984) established the second genus, Bradyrhizobium, based on the distinction between slow- and fast-growing rhizobia, which led to the transfer of Rhizobium japonicum to the genus Bradyrhizobium. Baldwin and Fred (1929) developed cross-inoculation tests to assess the specificity of Rhizobium for their host plants. Rhizobia were also classified into two categories depending on their growth rates, viz. fast growers and slow growers (Lohnis and Hansen 1921; Fred et al. 1932). These two groups of rhizobia have been shown to exhibit intragenic and intergenic diversity (Elkan 1992). The two groups also exhibit metabolic diversity, as fast-growing bacteria can utilize mannitol and sucrose (Allen and Allen 1950), while slow-growing bacteria utilize arabinose (Fred et al. 1932) as their carbon source. The cross-inoculation principle became less acceptable for classifying rhizobia, and Wilson (1944) provided evidence for abandoning the cross-inoculation group concept. It was also unhelpful because of the possibility of transfer of symbiotic plasmids among soil bacteria (Nakatsukasa et al. 2008). The position of symbiotic genes has also been used to differentiate between fast and slow growers, as these genes are located on the chromosome in slow-growing bradyrhizobia and on plasmids in fast growers. However, a Bradyrhizobium strain, DOA9, was reported to carry symbiotic genes on a megaplasmid (Teamtisong et al. 2013). In fact, rhizobial genes for symbiosis with legumes are not as stable as those present on the chromosome; rather, they are located on large plasmids. In most Rhizobium strains, genes encoding root hair adhesion, nitrogen fixation, infection thread formation and host specificity are found on one plasmid species (Djordjevic et al. 1982), and these genes are located on one segment of this Sym plasmid that spans approximately 20–30 kilobase pairs (kb) (Hombrecher et al. 1981). Thus, rhizobial symbiotic plasmids play an important role in symbiosis and contain the core symbiosis genes (nod and nif/fix) involved in nodulation and nitrogen fixation. Wang et al. (2018) compared 24 rhizobial symbiotic plasmids, which showed significantly different topologies when compared with phylogenetic trees constructed using nodCIJ and fixABC genes. Rhizobial symbiotic plasmids retain a mosaic structure due to transposition, horizontal gene transfer and plasmid DNA recombination (Lopez-Guerrero et al. 2012a, b), and for this reason plasmid-borne functions are avoided for taxonomic purposes (Saidi et al. 2014).

In the second half of the twentieth century, traditional phenotypic methods based on morphophysiological characteristics, growth kinetics, and pH of the growth medium were used to identify rhizobia (Vincent and Humphrey 1970). Major changes in the nomenclature of rhizobia occurred when rhizobia were classified with other methods, such as the polyphasic approach, which includes phenotypic, genotypic, and phylogenetic analysis, serology, RNA/DNA or DNA/DNA hybridization, and/or plasmid analysis, since the previous criteria (host-range nodulation and growth rates) were inconsistent (Vandamme et al. 1996; Rao et al. 2018). As a result, the number of rhizobial species increased rapidly (Table S1) (Zakhia and de Lajudie 2001).

Phylogenetic analysis of the 16S rRNA gene led to the division of the genus Rhizobium, placing Rhizobium, Agrobacterium and Allorhizobium in one group, whereas Sinorhizobium, Bradyrhizobium, Mesorhizobium and Azorhizobium formed separate clusters (Willems 2006). Multilocus sequence analysis (MLSA) using housekeeping genes, which has been used to identify and delineate bacteria at the species level, was also adopted for rhizobia (Rivas et al. 2009; Aserse et al. 2012). Before the delineation of new generic names, species now placed in Sinorhizobium and Mesorhizobium were classified under Rhizobium (Lindstrom et al. 2015). Proposals to integrate the closely related genera Agrobacterium and Allorhizobium into the genus Rhizobium, and to merge Sinorhizobium with Ensifer, have been much debated (Young et al. 2001; Willems et al. 2003).

However, analytical methods have improved over the last 30 years, and the emergence of whole genome sequence analysis now facilitates the recognition of novel species; as comparative genome sequence studies have shown, it is presently used as a powerful tool to study the taxonomy of rhizobia. Measuring genomic relatedness aids in the demarcation of genera and allows closely related species to be delineated into separate genera. Taxonomy based on genome sequencing (genotaxonomy) offers a clear framework for identifying species correctly and for exploring novel rhizobial species that are yet to be isolated from different legume species. A schematic timeline (Fig. 1) illustrates the major breakthroughs, alongside technical advances, in rhizobial taxonomy.

Fig. 1

A schematic diagram to illustrate major breakthroughs vis-a-vis technical advances, in rhizobial taxonomy

Numerical taxonomy of rhizobia based on phenotypic characteristics

The numerical taxonomy approach was applied to rhizobia on the basis of phenotypic characteristics, including morphology, physiology, serological analysis, symbiotic characteristics, utilization of carbon and nitrogen sources, metabolic features and other abiotic growth factors (Graham et al. 1991). Azorhizobium, a new genus, was discovered using the numerical taxonomy approach, as it was found to have characteristics different from those of other fast-growing rhizobia. It could utilize numerous carbohydrates and formed a branch separate from Rhizobium and Bradyrhizobium (Dreyfus et al. 1988). Also, three Rhizobium species (R. leguminosarum, R. phaseoli and R. trifolii), previously separated on the basis of cross-nodulation, were classified under the same species by the numerical taxonomy approach. R. japonicum and R. lupini clustered in one phenotypic group, and the fast growers (R. leguminosarum, R. phaseoli, R. trifolii, R. meliloti) were observed to be similar to Agrobacterium. Following this numerical taxonomy, the classification of Rhizobium was re-organized, which resulted in the identification of another rhizobial genus, Sinorhizobium (Chen et al. 1988). Based on physiological features, utilisation of carbon sources (alcohols, sugars and organic acids), and enzyme activities, Sinorhizobium xinjiangense, previously grouped with Sinorhizobium fredii, was reclassified as a separate species (Chen et al. 1988). The genus Mesorhizobium was defined on the basis of phenotypic characteristics, including nodulation and physiological properties, and five Rhizobium species (R. huakuii, R. ciceri, R. tianshanense, R. loti, and R. mediterraneum) were transferred to Mesorhizobium. It was shown to be phylogenetically different from other rhizobia such as Rhizobium, Sinorhizobium, Agrobacterium and related groups. Mesorhizobium was described as exhibiting growth intermediate between that of fast growers and slow growers. Members of this genus utilize glucose, rhamnose and sucrose with acid end products (Jarvis et al. 1997).
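The idea behind numerical taxonomy, scoring many phenotypic traits and grouping strains by overall similarity, can be illustrated with a minimal sketch. The simple matching coefficient is used here as one common choice of similarity measure; the strain names, trait columns and scores are hypothetical and are not taken from any of the cited studies.

```python
from itertools import combinations


def simple_matching(a, b):
    """Fraction of traits on which two strains agree (both positive or both negative)."""
    if len(a) != len(b):
        raise ValueError("trait vectors must have the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


# Hypothetical trait table: 1 = positive, 0 = negative.
# Illustrative columns: fast growth, acid production, mannitol use,
# sucrose use, arabinose use.
traits = {
    "strain_A": [1, 1, 1, 1, 0],
    "strain_B": [1, 1, 1, 0, 0],
    "strain_C": [0, 0, 0, 0, 1],
}


def similarity_table(table):
    """Pairwise simple matching coefficients for every pair of strains."""
    return {
        (p, q): simple_matching(table[p], table[q])
        for p, q in combinations(sorted(table), 2)
    }


if __name__ == "__main__":
    for pair, score in similarity_table(traits).items():
        print(pair, round(score, 2))
```

In practice such a similarity table would be fed to a clustering method to produce phenetic groups; that step is omitted here to keep the sketch short.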

Most rhizobia have been classified under Proteobacteria, mainly in the classes Alpha-proteobacteria (α-proteobacteria), Beta-proteobacteria (β-proteobacteria) and Gamma-proteobacteria (γ-proteobacteria) (Shiraishi et al. 2010). In α-Proteobacteria, six families belonging to the order Rhizobiales were defined: Bradyrhizobiaceae, Brucellaceae, Hyphomicrobiaceae, Methylobacteriaceae, Phyllobacteriaceae, and Rhizobiaceae. The number of rhizobial genera increased to 12 with 44 species (Sawada et al. 2003), and was soon revised to 53 rhizobial species belonging to Allorhizobium, Agrobacterium, Azorhizobium, Bradyrhizobium, Mesorhizobium, Rhizobium, and Sinorhizobium (Willems 2006). Later, Berrada and Fikri-Benbrahim (2014) reported 14 rhizobial genera with 98 species, and Suneja et al. (2017) listed 17 rhizobial genera with 168 validly published species, which was updated to 238 species distributed among 18 rhizobial genera as described by Shamseldin et al. (2017). Chen et al. (2021) detailed 20 genera of rhizobia in different families: Rhizobiaceae [Allorhizobium, Agrobacterium, Ensifer (syn. Sinorhizobium), Neorhizobium, Pararhizobium, Rhizobium, Shinella], Bradyrhizobiaceae (Bradyrhizobium), Brucellaceae (Ochrobactrum), Hyphomicrobiaceae (Devosia), Xanthobacteraceae (Azorhizobium), Phyllobacteriaceae (Aminobacter, Mesorhizobium, Phyllobacterium), Methylobacteriaceae (Methylobacterium, Microvirga), Burkholderiaceae (Paraburkholderia, Trinickia, Cupriavidus), and Pseudomonadaceae (Pseudomonas). In addition, Kuzmanovic et al. (2022) proposed the formation of a new rhizobial genus, Xaviernesmea. The second class, β-Proteobacteria, was found to be less diverse, as it included one family, Burkholderiaceae, with three genera: Paraburkholderia, Cupriavidus and Trinickia (Estrada-de los Santos et al. 2018). Cupriavidus was initially described as Ralstonia (Chen et al. 2001), and Paraburkholderia and Trinickia were formerly described as species of Burkholderia (Dobritsa and Samadpour 2016; Estrada-de los Santos et al. 2018).

Phylogeny and taxonomy of rhizobia based on small subunit (SSU) ribosomal RNA

Molecular phylogenetics emerged as a powerful tool to decipher evolutionary relationships between bacteria by utilizing molecular data (DNA, rRNA or protein sequences) (Dai et al. 2012). The 16S rRNA gene is regarded as the standard phylogenetic marker in the field of microbial taxonomy (Stackebrandt and Goebel 1994). In 16S rRNA-based groupings, fast- and slow-growing rhizobia were clearly segregated into separate lineages, as the two groups were found to be less closely related to each other than to their nonsymbiotic relatives. For instance, Rhizobium was found to be closely related to Agrobacterium, while slow-growing rhizobia had a close relationship with Rhodopseudomonas palustris (Young and Johnston 1989). 16S rRNA sequence alignment clearly distinguished rhizobia into the three genera already described by previous methods: Azorhizobium, Bradyrhizobium and Rhizobium (Young et al. 1991; Willems and Collins 1993). 16S rRNA gene sequence analysis thus agreed with previous strategies on the classification of rhizobia at the genus level, but was more definitive. Classification of rhizobia into five genera, i.e., Rhizobium, Azorhizobium, Sinorhizobium, Mesorhizobium and Bradyrhizobium, was supported by analysing the 16S rRNA sequences of seventeen recognised species of four rhizobial genera (Young and Haukka 1996). Phylogenetic analysis of 16S rRNA sequences led to the division of the genus Rhizobium and its relatives within the α-Proteobacteria.

According to 16S rDNA analyses, the different species of Agrobacterium and Allorhizobium undicola clustered together with all species of Rhizobium. Hence, Allorhizobium and Agrobacterium (Rhizobiaceae) were merged into the genus Rhizobium on account of their monophyletic nature and their common phenotypic generic constraint (Young et al. 2001), whereas Azorhizobium (Hyphomicrobiaceae), Bradyrhizobium (Bradyrhizobiaceae), Mesorhizobium (Phyllobacteriaceae) and Sinorhizobium (Rhizobiaceae) formed separate clusters (Willems 2006). The genus Rhizobium thus incorporated both the Allorhizobium and Agrobacterium genera, while Chelatobacter was renamed as Aminobacter (Young et al. 2001; Kampfer et al. 2002) and Sinorhizobium has been known as Ensifer (Young 2010), based on 16S rDNA sequence analysis. Rhizobium and Sinorhizobium showed a close relationship with Agrobacterium while being distantly related to Bradyrhizobium (Garrity et al. 2005) and Phyllobacterium (Mergaert and Swings 2006). Later, the isolation and identification of Agrobacterium species resulted in changes in the nomenclature of rhizobial species (Slater et al. 2013). Agrobacterium rhizogenes, an old species, was retained as Rhizobium rhizogenes, and a new species that induces plant tumours, Rhizobium tumorigenes, was also included (Kuzmanovic et al. 2018). The controversy was moderated by the reclassification of Agrobacterium larrymoorei as Rhizobium larrymoorei (Young 2004).

Ensifer (Sinorhizobium), Mesorhizobium and Rhizobium fall under α-Proteobacteria, while Burkholderia and/or Paraburkholderia and Cupriavidus belong to β-Proteobacteria (Andrews and Andrews 2017). Many rhizobial species have been reported to share high 16S rRNA sequence similarity (> 97%) or to have almost identical 16S rRNA sequences (Moura et al. 2020). Based on 16S rRNA sequence similarity, rhizobia were reported to belong to three main distinct phylogenetic subclasses, i.e., α-, β- and γ-Proteobacteria (Zakhia and de Lajudie 2001). Fig. 2 shows the phylogeny of rhizobial species belonging to these three subclasses, with representative species of the rhizobial genera.

Fig. 2

Maximum likelihood phylogenetic tree of 16S rRNA gene of 61 representative species of 26 genera of Rhizobia. Scale bar (0.05) indicates estimated nucleotide substitution per site

The use of the 16S rRNA gene sequence as a phylogenetic marker in rhizobia also presents some challenges, as some bacterial genomes possess multiple copies of the gene and it has been suggested to be vulnerable to horizontal gene transfer (van Berkum et al. 2003; Gevers et al. 2005). For instance, symbiotic rhizobia isolated from Mimosa spp. were highly specific, and the phylogenies based on 16S rRNA and on housekeeping gene sequences were observed to differ. Further, housekeeping gene sequences were reported to represent the diversity in line with the symbiosis genes for Burkholderia (isolated from Brazil) and Rhizobium/Ensifer (isolated from Mexico) (Bontemps et al. 2010, 2016). Therefore, the efficacy of 16S rRNA has been criticized for rhizobial taxa, and other housekeeping genes are given preference in delineating new species of rhizobia (Aserse et al. 2012), particularly because 16S rRNA cannot differentiate among the closest Rhizobium species (Ramirez-Bahena et al. 2008). Further, the 16S rRNA gene sequences of α- and β-proteobacteria are highly conserved, so discrimination of diverse species remains challenging; other complementary approaches have therefore been used (Azevedo et al. 2015), as discussed below.
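To make the resolution problem concrete, the following minimal sketch computes percent identity between two pre-aligned 16S rRNA sequences and flags pairs above the 97% similarity level mentioned above. The function names, the gap handling (gapped columns are skipped) and the toy sequences are assumptions made for illustration; real analyses would use full-length, properly aligned sequences.

```python
def percent_identity(aln_a, aln_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aln_a.upper(), aln_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def exceeds_threshold(identity, threshold=97.0):
    """True when a pair is too similar to be separated by this marker alone."""
    return identity > threshold


if __name__ == "__main__":
    # Toy aligned fragments, not real 16S rRNA sequences.
    a = "ACGT-ACGTA"
    b = "ACGTTACGTC"
    ident = percent_identity(a, b)
    print(round(ident, 2), exceeds_threshold(ident))
```

A pair that exceeds the threshold is not thereby shown to be conspecific; as the text notes, distinct rhizobial species can share nearly identical 16S rRNA sequences, which is why additional markers are needed.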

Taxonomy based on housekeeping genes

Several housekeeping gene sequences have been used to identify rhizobia at the genus level and to delineate rhizobial species with high relatedness (Rivas et al. 2009). Analyses have also included nitrogen fixation genes (nif, fix, x genes) and nodulation genes (nodABCIJ genes), which are located within genomic regions or on symbiotic plasmids in most of the α-rhizobia groups (Suominen et al. 2001). The diversity of rhizobial populations has been assessed by analysing the nodC and nifH genes (Dubey et al. 2010). Analysis of combinations of other gene sequences, such as dnaK (Stepkowski et al. 2003), glnII (Stepkowski et al. 2005), atpD and recA (Vinuesa et al. 2005), has elucidated rhizobial phylogenetic relationships. Genes such as atpD, recA and glnII help to differentiate the closely related R. leguminosarum sv. trifolii, R. leguminosarum sv. phaseoli and R. leguminosarum sv. viceae (Ribeiro et al. 2009). Screening of the recA gene was found to resolve and define rhizobial strains at the genus and species level (Lindstrom et al. 2015; Peix et al. 2015). In rhizobial taxa, the phylogeny of the recA gene, which codes for a component of the DNA recombination and repair system, has been shown to be similar to that of the small subunit rRNA genes (Gaunt et al. 2001; Vinuesa et al. 2005). Further, phylogenetic analysis of recA in bacteria has been observed to be consistent with the corresponding phylogeny of the 16S rRNA gene (Eisen 1995). Figures 2, 3 and 4 in this review present Maximum Likelihood (ML) phylogenetic trees that depict the evolutionary relationships among rhizobial genera based on analysis of the 16S rRNA, recA and atpD genes, respectively. The gene sequences were retrieved from GenBank, and the trees were constructed using the Tamura–Nei model (1993) and drawn to scale, with branch lengths measured in the number of substitutions per site. The recA and atpD genes have been sequenced in rhizobial strains of all genera, and they have been used to differentiate between rhizobial species whose 16S rRNA sequences are closely related (Valverde et al. 2006; Ramirez-Bahena et al. 2008). Young et al. (2001) classified Agrobacterium within the genus Rhizobium based on the phylogenetic relationships of rrs gene sequences, a proposal that met with controversy and was opposed by other scientists (Farrand et al. 2003). Therefore, the taxonomic classification of Agrobacterium was revised (Mousavi et al. 2014) based on rrs, recA, atpD and rpoB gene sequences. Subsequently, some Rhizobium species (R. pusense, R. skierniewicense, and R. nepotum) were transferred to the genus Agrobacterium, and R. vitis (originally A. vitis) was transferred to the genus Allorhizobium (Oren and Garrity 2016). Furthermore, other housekeeping genes of rhizobia, such as dnaK, gap, glnA, gltA, gyrB, pnp, recA, rpoB, and thrC, have been used for precise identification (Aoki et al. 2013). On the other hand, the nodA gene sequences of Cupriavidus rhizobia isolated from Uruguay were reported to be inconsistent with the housekeeping gene sequences; however, they were placed in the same clade, which indicated that several species of the group acquired their symbiosis genes through horizontal gene transfer (Platero et al. 2016). The symbiosis gene sequences (nodA, nodC, nifH and nifHD) of Burkholderia (Paraburkholderia) sp. and Pseudomonas sp. were found to be identical to those of other rhizobial species, which indicated that the genes had been acquired by horizontal gene transfer (Shiraishi et al. 2010). Careful analysis of these housekeeping genes in each genus revealed incongruent phylogenetic relationships among the loci, which has led to improved identification and characterization of rhizobia (Werner et al. 2015).
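Since the trees are described as being built under the Tamura–Nei (1993) model with branch lengths in substitutions per site, a minimal sketch of the corresponding pairwise distance may help show what that model corrects for: unequal base frequencies and separate rates for purine transitions, pyrimidine transitions and transversions. This is an illustrative distance calculation under stated assumptions (base frequencies pooled from the two sequences, columns with gaps or ambiguity codes ignored), not the maximum likelihood procedure used for the figures; the function name and toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def tn93_distance(seq1, seq2):
    """Tamura-Nei (1993) distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences carry an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")

    # Base frequencies pooled over both sequences (an assumption of this sketch).
    counts = {base: 0 for base in "ACGT"}
    for a, b in pairs:
        counts[a] += 1
        counts[b] += 1
    f = {base: counts[base] / (2 * n) for base in "ACGT"}
    if min(f.values()) == 0:
        raise ValueError("all four bases must occur in the data")
    g_r = f["A"] + f["G"]  # purine frequency
    g_y = f["C"] + f["T"]  # pyrimidine frequency

    # Proportions of A<->G transitions, C<->T transitions and transversions.
    p1 = sum(1 for a, b in pairs
             if a != b and a in PURINES and b in PURINES) / n
    p2 = sum(1 for a, b in pairs
             if a != b and a in PYRIMIDINES and b in PYRIMIDINES) / n
    q = sum(1 for a, b in pairs if (a in PURINES) != (b in PURINES)) / n

    k1 = 2 * f["A"] * f["G"] / g_r
    k2 = 2 * f["C"] * f["T"] / g_y
    k3 = 2 * (g_r * g_y - f["A"] * f["G"] * g_y / g_r
              - f["C"] * f["T"] * g_r / g_y)
    w1 = 1 - p1 / k1 - q / (2 * g_r)
    w2 = 1 - p2 / k2 - q / (2 * g_y)
    w3 = 1 - q / (2 * g_r * g_y)
    if min(w1, w2, w3) <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -k1 * math.log(w1) - k2 * math.log(w2) - k3 * math.log(w3)


if __name__ == "__main__":
    ref = "ACGT" * 5
    alt = "G" + ref[1:]  # a single A->G transition
    print(round(tn93_distance(ref, alt), 4))
```

The corrected distance is slightly larger than the raw proportion of differing sites, reflecting the model's allowance for multiple substitutions at the same site.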

Fig. 3

Maximum likelihood phylogenetic tree of recA gene of 51 representative species of 23 genera of Rhizobia

Fig. 4

Maximum likelihood phylogenetic tree of atpD gene of 47 representative species of 21 genera of Rhizobia

PCR-based techniques in rhizobial taxonomy

PCR-based techniques such as restriction fragment length polymorphism (RFLP), amplified fragment length polymorphism (AFLP), and random amplified polymorphic DNA (RAPD) have facilitated the determination of genetic variation in rhizobia (Silva et al. 2012; Onyango et al. 2015; Boakye et al. 2016). Universal and specific primers, including those targeting the 16S–23S rRNA internal transcribed spacer (ITS) region of different rhizobial strains, have been used in amplification and sequencing to distinguish the taxonomic positions of different rhizobial isolates (Gronemeyer et al. 2014). Rahmani et al. (2011) analysed common bean-nodulating rhizobia by the PCR–RFLP technique and reported that the isolates showed large genetic variation and comprised 43 ITS genotypes that clustered into ten groups at a similarity of 64%. PCR and amplified ribosomal DNA restriction analysis (ARDRA) of 41 rhizobial isolates from root nodules of beans categorized them into nine separate morphotypes (Koskey et al. 2018). RAPD-PCR was used by Harrison et al. (1992) to define strains of R. leguminosarum and by Niemann et al. (1997) to characterize indigenous S. meliloti strains.
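The principle behind PCR–RFLP typing can be sketched in a few lines: an amplified fragment is cut wherever a restriction enzyme's recognition site occurs, and the resulting fragment-length pattern serves as the genotype. The sketch below performs such a digest in silico on a linear sequence. The default site and cut position correspond to EcoRI (G^AATTC); the function names and toy amplicons are hypothetical, and the enzymes actually used in the cited studies are not specified here.

```python
def digest(sequence, site, cut_offset):
    """Fragment lengths after cutting a linear sequence at every occurrence of `site`.

    `cut_offset` is the number of bases of the site that stay on the left-hand fragment.
    """
    seq = sequence.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def rflp_pattern(sequence, site="GAATTC", cut_offset=1):
    """Sorted fragment lengths, analogous to a banding pattern on a gel."""
    return sorted(digest(sequence, site, cut_offset), reverse=True)


if __name__ == "__main__":
    # Toy amplicons: the second has a point mutation that destroys one site.
    isolate_1 = "AAAGAATTCTTTTGAATTCCC"
    isolate_2 = "AAAGAATTCTTTTGAGTTCCC"
    print(rflp_pattern(isolate_1))
    print(rflp_pattern(isolate_2))
```

Isolates with different patterns are assigned to different genotypes; a single point mutation in a recognition site is enough to change the pattern, which is what gives the method its discriminating power.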

Similarly, different PCR fingerprinting techniques, such as 16S rDNA PCR–RFLP, rep-PCR and RAPD analysis, have shown considerable diversity among eighteen soybean nodule isolates. RFLP patterns indicated that the isolates were different from Bradyrhizobium elkanii and Sinorhizobium fredii and were closely related to Bradyrhizobium japonicum (Sikora and Redzepovic 2003). Ogutcu et al. (2009) characterized R. leguminosarum subsp. ciceri isolates associated with chickpea species and revealed high intraspecies diversity among the strains using different PCR techniques. Characterization of exopolysaccharide-producing R. leguminosarum using PCR-based methods could discriminate among R. leguminosarum strains, R. etli and R. gallicum (Janczarek et al. 2009). Analysis of the genetic relationships and diversity of rhizobial isolates from Lembotropis nigricans revealed great heterogeneity: out of 33 rhizobia, AFLP could demarcate 32 genotypes and BOX-PCR could identify 27 genotypes, and the root nodule symbionts were identified as belonging to Bradyrhizobium japonicum (Wojcik et al. 2019). Bayesian inference of the phylogeny of atpD and recA sequences was used to study the taxonomic classification of Sesbania rhizobia, while identification of the isolates at the species level was evaluated using rrs plus rrl PCR-RFLPs, and the Sesbania isolates were identified as Mesorhizobium plurifarium or Rhizobium huautlense. The study revealed the geographic distribution of M. plurifarium, and the analysis indicated that R. galegae and R. huautlense belong to the same lineage and that R. gallicum, R. mongolense and R. yanglingense are synonyms (Vinuesa et al. 2005).

The use of multiple molecular markers gives a greater ability to discriminate between species. Phenotypic and molecular characterization with fingerprint markers, including BOX-, ERIC- and REP-PCR, could discriminate the rhizobia isolated from indigenous tree legumes (Mimosa tenuiflora, Piptadenia stipulacea and M. caesalpiniifolia). However, amplification by duplex PCR with the nifH and nodC genes could produce false-positive results, as these genes are highly polymorphic between species and biovars. This approach was therefore discouraged, and the use of larger molecular markers, which could provide more reliable information on the taxonomy and diversity of rhizobia, was recommended instead (Lyra et al. 2019).

Taxonomy of rhizobia based on polyphasic approach

The polyphasic approach has been used as a powerful technique for identifying and resolving members of the Rhizobiaceae family (Cardoso et al. 2012). In this approach, a combination of phenotypic classification and phylogenetic classification based on 16S rRNA and 23S rRNA gene sequences was employed to classify rhizobia (Vandamme et al. 1996). The technique has helped in studying the generic relationships of Bradyrhizobium and Rhizobium (Graham et al. 1991); in addition, Azorhizobium was discretely segregated with one species, Azorhizobium caulinodans (Dreyfus et al. 1988). The polyphasic study incorporates various other techniques. It was useful in characterizing 52 rhizobia isolated from Acacia spp. and Sesbania spp.: SDS-PAGE identified two clusters that were genotypically and phenotypically different from Rhizobium meliloti and R. fredii, and a third cluster was found to branch with R. loti. This polyphasic taxonomy was used to emend the genus Sinorhizobium and to reclassify Rhizobium meliloti as Sinorhizobium meliloti comb. nov. Further, two other species of the genus, namely S. saheli and S. teranga, were proposed for the strains isolated from Senegal (de Lajudie et al. 1994). Rhizobia that nodulate wild legumes were classified using polyphasic taxonomy that included other tools, such as fatty acid profiling and analysis of whole-cell protein patterns, which led to the classification of 20 strains as 12 strains of R. leguminosarum, 5 strains of S. meliloti and 3 strains of Rhizobium spp. (Zahran et al. 2003). Fatty acid methyl ester (FAME) analysis has been reported as a taxonomic marker for rhizobial classification and is also considered part of the polyphasic technique for identifying a new species (Zahran 1997). Fatty acid profiles were used to classify 600 rhizobial strains belonging to the genera Rhizobium, Agrobacterium, Sinorhizobium, Bradyrhizobium, and Mesorhizobium (Tighe et al. 2000). Diouf et al. (2000) used the polyphasic approach to classify 58 rhizobial strains isolated from West Africa, and the different phenotypic and genotypic techniques employed led to the classification of the isolates into two main groups belonging to R. tropici type B and R. etli. The isolates belonging to R. etli exhibited different electrophoretic types, which was indicative of internal heterogeneity within the strains as analysed by multilocus enzyme electrophoresis (MLEE). The heterogeneity was further examined by host-plant specificity, intergenic spacer region (ITS) PCR–RFLP, and SDS–PAGE, which revealed genetic variation in the isolates. Using the polyphasic approach, including phenotypic and genetic analyses, Pinto et al. (2007) characterized R. tropici strains from Brazil and found that the strains showed high variability in ribosomal genes but higher similarity in nifH and nodC genes, as confirmed by RFLP-PCR, and inferred that R. tropici might be divided into two different species. Based on the polyphasic approach, de Lajudie et al. (1998) classified rhizobia into seven genera (Rhizobium, Allorhizobium, Bradyrhizobium, Azorhizobium, Mesorhizobium, Methylorhizobium, Sinorhizobium). The indigenous rhizobial community of chickpea has been reported to exhibit heterogeneity at different locations and with different methods of characterization (Dudeja and Singh 2008; Nandwani and Dudeja 2009; Rai et al. 2012). The polyphasic approach has advantages in classifying microorganisms into precise genera and species, as it utilizes phylogenetic, phenotypic, genomic, and chemotaxonomic methods for characterization.

Taxonomy based on multilocus sequence analysis (MLSA)

As already explained, 16S rRNA-based phylogeny exhibits low resolution among highly related species, as the gene sequence is too conserved to separate closely related species. In such cases, MLSA of housekeeping genes helps to resolve taxonomic issues and to discriminate species and subspecies (Werner et al. 2015). Descriptions of new taxa, from genus down to species and subspecies level, have been supported by analysis of symbiotic genes, such as nodulation genes (nodABCIJ) and nitrogen fixation genes (nifDK, nifH, fix and x genes), together with glutamine synthetase (GSI, GSII), recA and atpD genes, leading to an appropriate taxonomy and systematics of legume-nodulating rhizobia (Zeze et al. 2001; Suominen et al. 2001; Ribeiro et al. 2009). MLSA of four genes (16S rRNA, atpD, recA and rpoB) supported the separation of Rhizobium giardinii, which represents a novel genus, Pararhizobium (Mousavi et al. 2015).

MLSA was used to study the symbiovars (symbiotic varieties) of Mesorhizobium nodulating chickpea. By analysing the phylogenetic relationships of core genes and the nodC symbiotic gene, it revealed the existence of one new chickpea Mesorhizobium species and one novel symbiovar, M. opportunistum sv. ciceri (Laranjo et al. 2012). Based on MLSA of six protein-coding housekeeping genes in 114 rhizobial taxa, species have been reclassified into different genera, namely Allorhizobium, Agrobacterium, Rhizobium, Pararhizobium and Neorhizobium (Mousavi et al. 2014, 2015).
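The core data-handling step in MLSA, joining the alignments of several housekeeping loci into one concatenated sequence per strain, can be sketched as follows. This is a minimal illustration assuming each locus has already been aligned separately; the function names, strain labels and toy sequences are hypothetical, and strains missing any locus are simply dropped here (other treatments, such as padding with gaps, are possible).

```python
def concatenate_loci(alignments, loci):
    """Join per-locus alignments into one sequence per strain.

    alignments: dict mapping locus name -> dict mapping strain -> aligned sequence.
    loci: order in which the loci are concatenated.
    Only strains present in every locus are kept.
    """
    for locus in loci:
        lengths = {len(seq) for seq in alignments[locus].values()}
        if len(lengths) > 1:
            raise ValueError(f"sequences for {locus} are not aligned to one length")
    shared = set.intersection(*(set(alignments[locus]) for locus in loci))
    return {
        strain: "".join(alignments[locus][strain] for locus in loci)
        for strain in sorted(shared)
    }


def identity(seq_a, seq_b):
    """Fraction of identical positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x == y) / len(seq_a)


if __name__ == "__main__":
    # Toy alignments; strain s3 lacks a recA sequence and is excluded.
    toy = {
        "atpD": {"s1": "ATGGC", "s2": "ATGGT", "s3": "ATGAC"},
        "recA": {"s1": "GGCTA", "s2": "GGCTA"},
    }
    merged = concatenate_loci(toy, ["atpD", "recA"])
    print(merged)
    print(identity(merged["s1"], merged["s2"]))
```

The concatenated alignment is then the input for tree building or pairwise comparison; combining loci increases the number of informative sites relative to any single gene.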

Omics technology in rhizobial taxonomy

Advances in whole-genome sequencing facilitate the classification of rhizobia based on ANI of the genomes, and species of Rhizobium were found to comprise numerous genomic lineages (Acosta et al. 2011; Santamaria et al. 2017). Whole genomes enable the reconstruction of phylogenomic trees from thousands of genes; these trees represent evolutionary relationships and have replaced phylogenies based on a few markers such as 16S rRNA genes. The low recombination rate observed among different strains of R. etli indicated that distinct genomic lineages could make up a given species or multiple species (Acosta et al. 2011).

Phylogenomic analysis of genome sequences led to the identification of Allorhizobium and distinguished Agrobacterium from other members of the Rhizobiaceae family. Genome phylogeny supported the inclusion of Rhizobium vignae in the Neorhizobium group; although its ANI values were less than 91%, it was considered as Neorhizobium vignae. Further, this technique revived Allorhizobium as a genus that includes Allorhizobium vitis (formerly Agrobacterium vitis) and Allorhizobium taibaishanense (formerly Rhizobium taibaishanense). Also, closely related species of Rhizobium leguminosarum were found within the tropici group and designated as Rhizobium rhizogenes, which was previously known as Agrobacterium rhizogenes (Ormeno-Orrillo et al. 2015). Gonzalez et al. (2019) suggested that phylogenomic clades represent an evolutionary continuum within the species defined by genomic clusters. This phylogenomic relationship, based on core genome markers and complete sets of ribosomal proteins, revealed the main lineages of Rhizobium.

New bioinformatics tools that overcome the technical limitations of classical DNA hybridization measurements are now used routinely to delineate prokaryotic species. At present, the primary approach in the taxonomy of rhizobia is based on genomic average nucleotide identity (ANI) between the genome sequences of the strains (Ormeno-Orrillo et al. 2015). ANI measures sequence similarity between a pair of genomes (designated the query and the reference genome) and is computed over the regions of the genomes that can be aligned. ANI values of 95–96% have been described as the boundary for species-level delineation. ANI values of concatenated partial sequences of core genes are also employed to delineate rhizobial species (de Lajudie et al. 2019). According to these criteria, the nodulating bacteria R. aegyptiacum (Shamseldin et al. 2017), R. esperanzae (Cordeiro et al. 2017) and R. ecuadorense (Ribeiro et al. 2015) have been defined as species. However, it is notable that ANI scores between a query and a reference genome are regularly asymmetric because of differences in gene complements and genome sizes. This asymmetry is not completely surprising, as it was regularly seen in past reciprocal hybridization studies that used labelled DNA probes. ANI also has restricted utility in characterizing species, subspecies, and strain-level relationships. It is suggestive of genomic clusters, but its values can vary within species, which can lead to the division or fusion of species depending on the cut-off used; phylogenomic and population genetic measures can therefore delineate species more reliably (Fraser et al. 2009).
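The logic of an ANI comparison, including the asymmetry between query and reference, can be sketched in a few lines of Python. This is a simplified illustration rather than the procedure of any particular ANI tool: the fragment hits are hypothetical toy values that would normally come from BLAST or MUMmer alignments, and the filtering cut-offs (`min_identity`, `min_aligned_fraction`) and the use of the reciprocal mean are assumptions made for the example. The 95% species cut-off is the lower bound of the 95–96% range cited above.

```python
from statistics import mean


def ani(hits, min_identity=30.0, min_aligned_fraction=0.7):
    """Mean percent identity of query fragments that match the reference well enough.

    hits: list of (percent_identity, aligned_fraction) tuples, one per query
    fragment with a match in the reference. Both cut-offs are illustrative
    assumptions, not values taken from the text.
    """
    kept = [
        identity
        for identity, fraction in hits
        if identity >= min_identity and fraction >= min_aligned_fraction
    ]
    return mean(kept) if kept else 0.0


def same_species(ani_ab, ani_ba, cutoff=95.0):
    """Apply a species cut-off to the mean of the two reciprocal ANI values."""
    return (ani_ab + ani_ba) / 2 >= cutoff


# Hypothetical fragment hits. The two directions differ because the genomes
# differ in size and gene content, so different fragments find a match.
a_vs_b = [(97.0, 0.95), (96.0, 0.90), (98.0, 1.00), (60.0, 0.40)]
b_vs_a = [(97.0, 0.95), (95.0, 0.85), (98.0, 1.00), (94.0, 0.80), (88.0, 0.30)]

ani_ab = ani(a_vs_b)
ani_ba = ani(b_vs_a)
print(ani_ab, ani_ba, same_species(ani_ab, ani_ba))
```

Because the two directions return different values (97.0 and 96.0 for the toy data), a pair of genomes lying close to the cut-off could be merged or split depending on which direction, or which cut-off within the 95–96% range, is used.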

Classification based on whole-genome sequence comparisons is termed genotaxonomy. Rhizobium spp. nodulating common bean and R. leguminosarum nodulating clover were found to comprise diverse genomic clusters of related strains (Kumar et al. 2015; Perez-Carrascal et al. 2016). Based on genomic comparison, common bean-nodulating rhizobial strains assigned to R. etli and R. phaseoli were suggested to represent independent species within the same environment (Miranda-Sanchez et al. 2016; Santamaria et al. 2017). Gan et al. (2019) analysed the genome sequences of A. radiobacter NCPPB3001T and A. tumefaciens B6T, compared them with A. radiobacter LMG140T, and determined that the type strains of A. tumefaciens and A. radiobacter represent two subspecies of the same species.

The draft genome sequence of rhizobial strain NAU-18T was reported to contain 6588 protein-coding genes. Phylogenetic analysis showed that the strain was similar to Neorhizobium alkalisoli CCBAU 01393T and Rhizobium oryzicola ZYY136T and clustered with R. oryzicola based on 16S rRNA gene sequences. The strain represented a novel species of Rhizobium and was classified as Rhizobium terrae sp. nov. NAU-18T (Ruan et al. 2020). Gonzalez et al. (2019) studied genomic clusters to establish the significance of Rhizobium phylogeny at the species level. Rhizobial species that resemble R. etli and R. leguminosarum were inversely correlated and displayed genomic clusters with ANI > 95%. The pan-genome of Rhizobium revealed gene presence/absence profiles, in both chromosomes and plasmids, that follow the phylogenomic pattern of species divergence, which may be due to inter-strain gene transfer. Rhizobium genome clusters may be part of the evolutionary divergence that leads to the formation of species. Considering the dynamics of genome evolution in bacteria, accessory genes are the determining factor for adaptation and specialization. These genes comprise mobile genetic elements, including phages and transposons, which are generally termed symbiosis-related genes. Genomic islands are mobile elements that are flanked by tRNA genes (Young et al. 2006). The bacterial genome was revealed to have symbiosis islands that were closely related to those of Mesorhizobium loti of the family Phyllobacteriaceae (Kaneko et al. 2000).

Also, lateral gene transfer has been suggested to play an important role in genome evolution in Agrobacterium/Rhizobium and Ensifer/Sinorhizobium (Young et al. 2006). It has been stated that a stable taxonomy is specified by core genes that are present in the chromosome and involved in housekeeping processes. The specificity of the same bacterial species for different hosts is due to the presence of different accessory genes; in the case of rhizobia, these are the “nodulation genes”, which determine host specificity (Young et al. 2006). The accessory genes aid in discriminating closely related species, while the core genes recA and atpD have been used to specify relationships among different mesorhizobia. Accessory genes also confer important properties other than nodulation, such as pathovar (which defines the specificity of a plant pathogen) and serovar (which defines the antigenic properties of the bacterial cell surface) (Berrada and Fikri-Benbrahim 2014). Comparative genomic analysis of 29 rhizobia (21 Rhizobium, 4 Ensifer, 4 Bradyrhizobium) showed that horizontal gene transfer occurred at the plasmid level despite the high plasticity of symbiosis genes. This revealed that symbiosis and housekeeping genes played an important role in rhizobial evolution, which expanded the diversity of bean-nodulating Rhizobium strains. Further phylogenetic analysis of 191 HGT genes was consistent with the taxonomy of the bacterial species. Dispersion of symbiosis genes between rhizobial genera was suggested to be unusual, whereas expansion of genes within the same genus was common and could result in the formation of multiple symbiovars. Comparative genomic analysis of Ensifer and Bradyrhizobium revealed diverse symbiotic regions and showed symbiotic compatibility between soybean and common bean microsymbionts (Tong et al. 2020).

Comparative genome analysis of strains of the Rhizobiaceae family indicated that replicons varied and included single chromosomes, extrachromosomal replicons (ERs, or chromids) and plasmids (Slater et al. 2009). ER genes are genus-specific and serve accessory functions (Harrison et al. 2010). The chromids in Agrobacterium/Rhizobium and Ensifer/Sinorhizobium genomes represented half of these genomes. It has been stated that the nodulation genes and the genes for nitrogen fixation may reside in these chromids (Lopez-Guerrero et al. 2012a, b; Althabegoiti et al. 2014) and that their presence makes these species able to grow faster in culture (Harrison et al. 2010).

Taxonomic classification based on whole-genome sequences, core genome phylogeny, and chemotaxonomic comparison of a group of Rhizobium species resulted in a novel genus, Pseudorhizobium. This led to the reclassification of Rhizobium flavum, R. endolithicum, R. halotolerans and R. marinum as P. flavum comb. nov., P. endolithicum comb. nov., P. halotolerans sp. nov., and P. marinum comb. nov., respectively. The resolution of the taxonomic classification was improved and supported by the genomic basis of phenotypic traits and by fatty acid, protein, and metabolic profiles. Phylogenetic analysis of the pan-genome of Pseudorhizobium indicated that each species within this genus has diverged to adapt to its ecological niche (Lassalle et al. 2021). Among the α-rhizobia, Bradyrhizobium and Azorhizobium have a single chromosome (Kaneko et al. 2002; Lee et al. 2008), while Mesorhizobium has a megaplasmid along with the chromosome (Kaneko et al. 2000). Sinorhizobium and Rhizobium have highly divided genome structures: R. leguminosarum harbors seven replicons (Young et al. 2006), whereas more than half of the S. meliloti genome resides on the chromosome. The genomes of α-Proteobacteria rhizobia follow the phylogenetic relatedness of these species (Galibert et al. 2001). The pan-genome can indicate intraspecies genomic diversity (Vernikos et al. 2015), and rhizobia have been reported to have large pan-genomes comprising thousands of genes that contribute to their phenotypic diversity. The pan-genome of S. meliloti had over 20,000 genes (Sugawara et al. 2013) and that of Bradyrhizobium had 35,000 genes (Tian et al. 2012). The location of the classical symbiotic genes (i.e., nif, nod and fix genes) has been used as a genotypic tool to classify fast- and slow-growing species of Rhizobium. These genes are found on large symbiotic plasmids or megaplasmids in α- and β-rhizobia (Rosenberg et al. 1981; Teamtisong et al. 2013).
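The distinction between the pan-genome, the core genome and the accessory genome referred to above can be made concrete with a small Python sketch. The strain names and gene lists are hypothetical toy data chosen for illustration; in practice the input would be ortholog clusters from a dedicated pan-genome pipeline rather than raw gene names.

```python
# Hypothetical gene content per strain (toy data for illustration only).
gene_content = {
    "strain1": {"recA", "atpD", "rpoB", "nodC", "nifH"},
    "strain2": {"recA", "atpD", "rpoB", "nodC"},
    "strain3": {"recA", "atpD", "rpoB", "nifH", "traA"},
}


def partition_pangenome(gene_content):
    """Split gene content into pan-genome, core genome and accessory genome.

    pan: genes present in at least one strain
    core: genes present in every strain
    accessory: genes present in some strains but not all
    """
    genomes = list(gene_content.values())
    pan = set().union(*genomes)
    core = set.intersection(*genomes)
    accessory = pan - core
    return pan, core, accessory


pan, core, accessory = partition_pangenome(gene_content)
print("pan-genome size:", len(pan))
print("core genes:", sorted(core))
print("accessory genes:", sorted(accessory))
```

In this toy data set the housekeeping genes fall into the core, while the symbiotic and plasmid-associated genes fall into the accessory fraction, mirroring the division between stable taxonomic markers and host-specificity determinants described in the text.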

Kuzmanovic et al. (2022) proposed a framework for genus delineation in the family Rhizobiaceae, in which genera were separated from related species using core-proteome average amino acid identity (cpAAI) and were defined as monophyletic groups based on core genome phylogeny. They proposed that genomic or phylogenetic data could help to divide species into separate genera and reclassified Rhizobium rhizosphaerae and R. oryzae into Xaviernesmea gen. nov. The study also provided data for the formation of Endobacterium yantingense comb. nov., Mycoplana azooxidifex comb. nov., Neorhizobium petrolearium comb. nov., Pararhizobium arenae comb. nov., Peteryoungia aggregata comb. nov., and Pseudorhizobium tarimense comb. nov. Using genomic data, phenotypic data, and cpAAI values (> 86%) of all Ensifer and Sinorhizobium species, they proposed that these two genera be considered separate. Previously, ANI values of Ensifer fredii strains USDA 257 and NGR 234 were reported to be low compared with the type strain of E. fredii and with other Sinorhizobium americanum strains, which indicated that NGR 234 corresponds to a separate species (Lloret et al. 2007).

Current classification of rhizobia

Most rhizobia belong to the class α-proteobacteria and are widely distributed among host plants, whereas β-proteobacteria are mainly isolated from root nodules of Mimosa sp. (Liu et al. 2020). The family Rhizobiaceae of the α-proteobacteria is diverse and has undergone several revisions; 21 genera are currently recognized: Allorhizobium, Agrobacterium, Carbophilus, Ciceribacter, Ensifer, Endobacterium, Georhizobium, Gellertiella, Hoeflea, Liberibacter, Lentilitoribacter, Mycoplana, Martelella, Neorhizobium, Neopararhizobium, Pseudorhizobium, Peteryoungia, Rhizobium, Sinorhizobium, Shinella and Xaviernesmea (Kuzmanovic et al. 2022; https://lpsn.dsmz.de/). Taxonomic descriptions of the various rhizobial genera that form nodules on different hosts are listed in Table S2. The different genera of rhizobia that are able to induce nodulation in their respective hosts are listed in Table 1.

Table 1 Different genera of classified rhizobia

Today, the taxonomic classification of bacteria is based on accessible genomic data from sequenced prokaryotic genomes. A decade ago, genome sequencing remained costly and tedious; however, the NGS strategies introduced after 2005 have made it much cheaper and quicker. Genome sequences deposited in public databases are readily available for phylogenomics, and overall genome-based indices have therefore replaced DNA-DNA hybridization (DDH) because of their low cost and the quality of the genomic information they provide (Sentausa and Fournier 2013). As explained above, this has also affected the taxonomy of rhizobia. Parks et al. (2018) proposed a standardized bacterial taxonomy (GTDB taxonomy, http://gtdb.ecogenomic.org/) based on the phylogeny of bacterial genomes, inferred from the amino acid sequences of 120 proteins encoded by 120 universal genes. This strategy uses a concatenated protein phylogeny for prokaryotic classification and conservatively removes polyphyletic groups; it will be interesting to see whether it can resolve relationships down to the inter-genus level, as is needed for the diversity of rhizobia.

Conclusion

Continual identification of new legume-nodulating bacteria has resulted in considerable changes in the taxonomy and nomenclature of rhizobia. Phylogenetic studies using the 16S rRNA gene determine the taxonomic position of rhizobia, while the polyphasic approach came into use as the most reliable method for delineation at the species level. Sequence analysis of 16S rRNA, 16S–23S rRNA and other housekeeping genes, advances in molecular biology techniques, and the use of bioinformatics have made it possible to identify, classify, and discriminate rhizobia to species and subspecies levels. Most symbiotic nitrogen-fixing bacteria belong to the phylum Proteobacteria, of which α-Proteobacteria are the most widely distributed in the environment and among host plants, β-Proteobacteria are less widely distributed and found in specific legumes, and γ-Proteobacteria have been reported for some isolates from temperate legume trees. Genomic analyses have had a significant impact on rhizobial taxonomy. Rhizobial genomes span the whole spectrum from unichromosomal to highly multipartite, and some strains carry a single chromosome together with a megaplasmid. Genomic approaches can describe the main features of Rhizobiaceae genomes, including chromids/ERs, plasmids, and the significance of horizontal gene transfer. The genetic material and genome organization of rhizobia reflect the evolutionary process of multipartite genomes, which provides valuable models for understanding the significance of genome organization in environmental adaptation. Comparative genome sequence analysis coupled with ANI can describe new species and has replaced wet-lab DDH values in species characterization. For accurate taxonomic characterization, it is better to combine different parameters, such as phenotypic, genotypic and chemotaxonomic data, with genome sequence analysis.