Introduction

Several plant species belonging to the N-fixing clade, a monophyletic group of angiosperms, are able to establish N-fixing symbiosis in root-derived nodule cells with bacteria of different genera. Both the origin and the complex diversification of N-fixing symbiosis have puzzled biologists for a long time and are still under debate (Doyle 2011; Geurts et al. 2016; Griesmann et al. 2018). The dispersed occurrence of N-fixing symbiosis in the N-fixing clade can be explained by the non-homologous origins of N-fixing symbiosis (Svistoonoff et al. 2013; Werner et al. 2014). This point of view is coherent with the predisposition hypothesis, which states that the common ancestor of the N-fixing clade evolved a predisposition towards the evolution of N-fixing symbiosis (Soltis et al. 1995). Under this paradigm, which is in agreement with the traditional concept of deep homology (Shubin et al. 2009), in various lineages of the N-fixing clade, the emerged predisposition has been largely retained in a non-N-fixing symbiosis role (e.g., stress response), and the non-homologous origins of N-fixing symbiosis represent convergent evolutionary events selected by the limitation of fixed nitrogen in many ecosystems (Doyle 2016). This is in accordance with the emergence of complex traits through incremental key transitions (Oyserman et al. 2016), and with the extraordinary diversity and ecological success of plant species that obtain fixed nitrogen from N-fixing symbiosis (Lavin et al. 2005). Evolutionary reconstructions are normally based on observations rather than on the design and performance of experiments, and consequently, different scenarios for very early evolutionary events are, by nature, speculative (Cleland and Xa 2002). Exceptionally, the predisposition hypothesis makes predictions that are testable and refutable. 
Specifically, this hypothesis indicates that it is possible to identify, by genomic comparisons, a cryptic evolutionary innovation that drives the evolution of symbiotic N-fixation (Werner et al. 2014), and then to experimentally recreate the acquisition of N-fixing symbiosis by transferring this genetic novelty to non-N-fixing plants, including major cereal crops. Therefore, the identification of key factors for the emergence of N-fixing symbiosis is not only able to bypass the typical epistemological limitations of evolutionary reconstructions, but also has large ecological and economic consequences.

During N-fixing symbiosis, rhizobia differentiate into bacteroids, which are N-fixing, ammonium-excreting forms of rhizobia (Jones et al. 2007). Bacteroids inside infected root cells are surrounded by a symbiosome membrane, derived from the plant cell plasma membrane, which effectively segregates bacteroids from the host cytoplasm and controls the movement of metabolites between the two symbiotic partners. Rapid movement of ammonium from the bacteroid to the plant cytoplasm is facilitated by the nodule-specific ammonium channel NOD26 (Hwang et al. 2010), the main protein component of the symbiosome membrane in soybean (Glycine max) (Fortin et al. 1987). Since ammonium is the principal beneficial compound obtained by plants during rhizobial symbiosis and because its rapid uptake by plants results from NOD26 activity, this channel is a good candidate to be a key molecular innovation for the emergence of N-fixing symbiosis.

Methods

Search of Protein Databases and Identification of NOD26-Related Proteins

AQP protein sequences of Arabidopsis thaliana (AtPIP1;1, AtTIP1;1, AtNIP1;1, and AtSIP1;1) and Physcomitrella patens (PpGIP1;1 and PpXIP1;1) were used as queries to search genome databases, including Phytozome (https://phytozome.jgi.doe.gov/pz/portal.html), PlantGDB (http://www.plantgdb.org/), NCBI (https://www.ncbi.nlm.nih.gov/), and PLAZA (https://bioinformatics.psb.ugent.be/plaza/), using BLASTP (amino acid identity cut-off ≥ 15%). Thirty green plant genomes, including genomes from unicellular organisms (Volvox carteri, Coccomyxa subellipsoidea, Ostreococcus lucimarinus, and Chlamydomonas reinhardtii) as well as genomes from multicellular eukaryotes (Glycine max, Phaseolus vulgaris, Medicago truncatula, Lotus japonicus, Cucumis sativus, Fragaria vesca, Prunus persica, Malus domestica, Ricinus communis, Morchella esculenta, Populus trichocarpa, Linum usitatissimum, Arabidopsis thaliana, Solanum tuberosum, Theobroma cacao, Vitis vinifera, Paspalum virgatum, Scilla italica, Zea mays, Oryza sativa, Brachypodium distachyon, Picea abies, Picea sitchensis, Cryptomeria japonica, Selaginella moellendorffii, and Physcomitrella patens), were explored for the presence of AQPs. Proteins that were not identified by the BLASTP analysis but are encoded within the genomes of these green plants and contain the PF00230 domain (https://pfam.xfam.org/family/PF00230) were also selected.
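The identity-based selection of BLASTP hits described above can be illustrated with a minimal sketch. It assumes that the hits were exported in the standard BLAST+ tabular format (-outfmt 6, default 12 columns) and that the 15% identity cut-off acts as a lower bound; both points are assumptions, and the function names and any subject identifiers passed to them are hypothetical rather than part of the original workflow.

```python
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    query: str       # query AQP, e.g. AtNIP1;1
    subject: str     # protein identifier in the searched genome
    identity: float  # percent identity reported by BLASTP
    evalue: float


def parse_blast_tabular(lines: Iterable[str]) -> list[Hit]:
    """Parse BLAST+ tabular output (-outfmt 6, default column order)."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        # Default columns: qseqid sseqid pident ... evalue bitscore
        hits.append(Hit(cols[0], cols[1], float(cols[2]), float(cols[10])))
    return hits


def candidate_subjects(hits: Iterable[Hit], min_identity: float = 15.0) -> list[str]:
    """Return subjects with at least one hit at or above min_identity.

    Treating the cut-off as a minimum is an assumption of this sketch.
    """
    return sorted({h.subject for h in hits if h.identity >= min_identity})
```

Proteins recovered only through the PF00230 domain search would have to be added to this candidate list in a separate step.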

Synteny, Phylogenetic, and Expression Analyses

Genomic synteny comparisons among chromosomes of plant genomes were performed for dicot species, including species belonging to the N-fixing clade (Lotus japonicus, Medicago truncatula, Phaseolus vulgaris, Glycine max, Arachis ipaensis, Parasponia andersonii, Trema orientalis, and Cucumis sativus), as well as Ricinus communis (a species closely related to the N-fixing clade) and the model plant Arabidopsis thaliana, using the LASTZ plugin in the Geneious v10.1 software (Kearse et al. 2012). A strict criterion (> 50% amino acid identity) was used to identify homologous proteins among chromosomes. Protein identities were calculated using MatGAT (Campanella et al. 2003). AQPs were phylogenetically and molecularly analyzed using MEGA version 7.0 (https://www.megasoftware.net/). Protein sequences were aligned using the ClustalW program. Phylogenetic trees were constructed using the Neighbor-Joining (NJ) method, with genetic distances computed using the p-distance model, bootstrap analysis of 500 resamples, and midpoint rooting. Microarray gene expression data were collected from Mt-Atlas (Benedito et al. 2008), Gm-Atlas (Libault et al. 2010), LjGEA (Verdier et al. 2013), PvGEA (O’Rourke et al. 2014), Ai-Atlas (Chopra et al. 2014), and Pa-Atlas (van Velzen et al. 2018).
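For readers unfamiliar with the distance model named above, the following sketch shows how a p distance (the proportion of differing sites between two aligned sequences) and one bootstrap replicate of an alignment can be computed. It is not the MEGA implementation; the pairwise-deletion treatment of gaps and all function names are assumptions made for illustration, and tree building itself is left to dedicated software.

```python
import random


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped
    (pairwise deletion, an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(alignment: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise p distances, keyed by alphabetically ordered name pairs."""
    names = sorted(alignment)
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}
```

Repeating `bootstrap_alignment` 500 times and rebuilding a tree from each replicate's distance matrix would give the bootstrap percentages reported at the branch points.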

Results and Discussion

NOD26 is a member of the aquaporin (AQP) superfamily, a group of highly diversified channel proteins that passively facilitate the bidirectional transport of water, small solutes, ions, and/or dissolved gases across cell membranes (Perez Di Giorgio et al. 2014). To begin to analyze the origin of NOD26, in the present study, we examined the phylogenetic relationships of AQPs from 30 whole genome-sequenced species of plants representing major branches of the plant kingdom. As expected for the archetype of the subfamily, NOD26 clustered within the NIP (NOD26-like intrinsic protein) AQP subfamily (Fig. 1), and not within the TIP, PIP, XIP, SIP, or GIP AQP subfamilies. In addition, NOD26 showed high phylogenetic affinity with the angiosperm-exclusive subgroup NIP 1 (Fig. 1), which is consistent with NOD26 being required for the evolution of N-fixing symbiosis. Thus, N-fixing symbiosis might not have emerged in ancestral green plants because these plants did not have NOD26.

Fig. 1

NOD26 belongs to NIP 1, an angiosperm-exclusive subgroup of the NIP AQP subfamily. Large-scale phylogenetic analysis of NOD26 from Glycine max (boxed) based on the NJ method. The GIP, SIP, XIP, PIP, TIP, and NIP subfamilies of AQPs, with focus on the classical NIP 1 to NIP 4 subgroups of NIPs (bold type), and the taxonomy of species (different colors) are shown. Angiosperms: Fabids (orange): Lotus japonicus, Medicago truncatula, Phaseolus vulgaris, Glycine max, Cucumis sativus, Ricinus communis, Fragaria vesca, Prunus persica, Malus domestica, Morchella esculenta, Populus trichocarpa, and Linum usitatissimum; Malvidae (red): Arabidopsis thaliana, Solanum tuberosum, Theobroma cacao, and Vitis vinifera; Panicoideae (light blue): Paspalum virgatum, Scilla italica, and Zea mays; Ehrhartoideae (blue): Oryza sativa and Brachypodium distachyon; Ancestral green plants: Gymnosperms (green): Picea abies, Picea sitchensis, and Cryptomeria japonica; Spikemosses (gray): Selaginella moellendorffii; Mosses (fuchsia): Physcomitrella patens; and Chlorophyta (black): Volvox carteri, Coccomyxa subellipsoidea, Ostreococcus lucimarinus, and Chlamydomonas reinhardtii. (Color figure online)

To better understand when NOD26 appeared in the angiosperm lineage, we restricted the evolutionary studies to well-characterized species of the Fabids clade and the model species Arabidopsis thaliana, which belongs to the Fabids sister clade Malvids (Fig. 2). Regarding the Fabids clade, we selected both N-fixing species (Lotus japonicus, Medicago truncatula, Phaseolus vulgaris, Glycine max, Arachis ipaensis, and Parasponia andersonii) and non-N-fixing species (Trema orientalis and Cucumis sativus) from the N-fixing clade, and their related species Ricinus communis from the Malpighiales (Fig. 2). N-fixing species were selected based on the availability of both robust genomic sequences and global expression analyses, which allowed the integration of bioinformatic and empirical studies. Comparison of the nucleotide sequence of the region around NOD26 (Glyma08g12650) of G. max with other genomes of phylogenetically related plant species showed a conserved synteny without large-scale DNA rearrangements (Fig. 3). This chromosomal region, which we named the Biological Nitrogen Fixation (BNF) region, is duplicated in L. japonicus, M. truncatula, P. vulgaris, and A. ipaensis, and quadrupled in G. max, which is in accordance with ancestral and recent events of genome duplications in the ancestors of Legumes and G. max, respectively (Bertioli et al. 2009; Schmutz et al. 2010). In addition to the increase in the NOD26 copy number (paralogous genes) in Legumes via block duplication events, tandem duplications of NOD26 were observed in the BNF regions of species belonging to the Legumes (L. japonicus, M. truncatula, P. vulgaris, G. max, and A. ipaensis) and the Rosales (P. andersonii and T. orientalis), but not in C. sativus, which belongs to the Cucurbitales, or in the control species R. communis and A. thaliana (Fig. 3). 
In agreement with the predisposition hypothesis, the most parsimonious explanation for the synteny results is the occurrence of a single AQP in the BNF region in ancestral angiosperms, the emergence of NOD26 in the N-fixing clade by a tandem duplication event, the amplification of NOD26 in the N-fixing clade by both block and tandem duplication events, and the loss of NOD26 in the ancestor of C. sativus. Analogously, we have recently reported the emergence of the pollen-specific AQPs NIP4;1 and NIP4;2 in angiosperms by a tandem duplication event, which is associated with accelerated pollen tube growth rates (Di Giorgio et al. 2016), a critical novelty for the rapid diversification of angiosperms (Williams 2008).
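The distinction drawn above between a single AQP and tandemly duplicated AQPs within a syntenic block can be made concrete with a small sketch. It assumes that a region is available as a chromosomally ordered list of genes annotated with Pfam accessions; the function name, the `max_gap` parameter, and any gene identifiers supplied to it are hypothetical, and the synteny analysis reported here was performed with LASTZ in Geneious rather than with this code.

```python
AQP_DOMAIN = "PF00230"  # Pfam accession used in Methods to recognize AQPs


def tandem_arrays(genes, domain=AQP_DOMAIN, max_gap=0):
    """Find runs of neighboring genes that carry the given Pfam domain.

    genes: list of (gene_id, set_of_pfam_accessions) in chromosomal order.
    max_gap: number of intervening genes without the domain tolerated
             inside a run (0 means strictly adjacent copies).
    Returns a list of runs (lists of gene IDs) with at least two members.
    """
    arrays, current, gap = [], [], 0
    for gene_id, domains in genes:
        if domain in domains:
            current.append(gene_id)
            gap = 0
        elif current:
            gap += 1
            if gap > max_gap:
                # The run is broken: keep it only if it is a true array.
                if len(current) >= 2:
                    arrays.append(current)
                current, gap = [], 0
    if len(current) >= 2:
        arrays.append(current)
    return arrays
```

Under this definition, a region with one AQP returns no array, whereas a region with two or more neighboring AQPs returns one array, which corresponds to the tandem duplication pattern described for the N-fixing clade.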

Fig. 2

Schematic representation of the phylogenetic relationship among angiosperm species. This scheme shows the largely accepted evolutionary relationship among different angiosperm species, where the monophyletic N-fixing clade, which contains both N-fixing (Lotus japonicus, Medicago truncatula, Phaseolus vulgaris, Glycine max, Arachis ipaensis, and Parasponia andersonii) and non-N-fixing species (Trema orientalis and Cucumis sativus), is closely related to Ricinus communis and related to the model species Arabidopsis thaliana

Fig. 3

Emergence of NOD26 in the N-fixing clade and its large expansion in Legumes. The NOD26 from Glycine max is situated within a conserved region of the angiosperm genome. This genomic region, which is particularly expanded in Legumes by duplicated blocks on different chromosomes (Ch6 and Ch4 in Lotus japonicus; Ch2 and Ch8 in Medicago truncatula; Ch6 and Ch2 in Phaseolus vulgaris; Ch15, Ch13, Ch5, and Ch8 in Glycine max; and Ch10 and Ch3 in Arachis ipaensis), contains only one AQP in control species related to the N-fixing clade (Ricinus communis and Arabidopsis thaliana), and large tandem duplications of AQPs in all species belonging to the N-fixing clade (Lotus japonicus, Medicago truncatula, Phaseolus vulgaris, Glycine max, Arachis ipaensis, Trema orientalis, and Parasponia andersonii), with the exception of the non-N-fixing species Cucumis sativus. Orthologous genes are shown in the same color (orange = Pfam:00810; green = Pfam:02469; light blue = Pfam:04564; pink = Pfam:04844; dark blue = Pfam:00447; light yellow = Pfam:00230; white = hypothetical protein; gray = Pfam:06775; yellow = Pfam:11721; purple = Pfam:01031; brown = Pfam:00854; very light blue = Pfam:01680; light green = Pfam:01486; dark gray = Pfam:00657; olive green = Pfam:01823; blue = Pfam:07823; light orange = Pfam:00931; very light green = Pfam:01398; dark green = Pfam:03031; dark pink = Pfam:01363; ocean blue = Pfam:05915; cinnamon = Pfam:00069; red = Pfam:00012; light gray = Pfam:00560; light brown = Pfam:09262). The orthologous block of genes is represented with gray shadows. Arrows and squares represent gene orientation and transposons, respectively. The genomic blocks containing NOD26 and its related AQPs, which are named the BNF (Biological Nitrogen Fixation) region, are shown in light yellow, whereas homologous genomic blocks upstream and downstream of the BNF region are shown in gray. In the bottom panel, schematic representations of both block and tandem duplications are shown. (Color figure online)

Although the synteny analysis is consistent with the predisposition hypothesis, an integrated phylogenetic and expression analysis is important to support the origin of NOD26 in the N-fixing clade, and necessary to better understand the diversification of NOD26 within Legumes. In line with this, we examined the expression and evolution patterns of all paralogous copies of NOD26 included in the BNF regions of N-fixing species (Table 1; Fig. 4). The expression analysis showed that at least one copy of NOD26 is highly upregulated in inoculated roots or nodules of Legume (LJ4g029020 in L. japonicus, Medtr8g087710 and Medtr8g087720 in M. truncatula, Phvul.002G242300 in P. vulgaris, Glyma08g12650 in G. max, and LOC107621874 in A. ipaensis) and non-Legume (PanWU01x14_248030 in P. andersonii) N-fixing species, supporting an essential and ancestral role of NOD26 in N-fixing symbiosis (Table 1). Considering that these active copies of NOD26 are in different locations within BNF regions in Legume species (Fig. 3), we inferred the coexistence of different active copies of NOD26 in the ancestor of Legumes, and the inactivation of most of these copies during the evolution of the N-fixing clade. In addition, refined phylogenetic analysis supported the emergence of NOD26 in the N-fixing clade, and clarified the order of NOD26 duplications in Legumes (Fig. 4). Specifically, the phylogenetic tree suggests that the block duplication event (originating clusters G1.1.1 and G1.1.2) preceded the tandem duplication event (originating clusters G1.1.2A and G1.1.2B), indicating that genome duplication in the common ancestor of Legumes could be critical for the consolidation of a stable N-fixing precursor. Moreover, clusters G1.1.2A and G1.1.2B exhibited complete congruence with the organismal tree at the species level, whereas cluster G1.1.1 showed congruence only with the evolution of the CA- and D-clades of Legumes (Fig. 4). 
This partial congruence pattern has been described in other AQP clusters and is normally the consequence of the differential and large duplications of paralogous copies within different plant species or plant lineages (Abascal et al. 2014). In the case described here, the presence of several transposons next to the AQP grouping within cluster G1.1.1 (Fig. 3) suggests that the high rate of duplication of these AQPs could be mediated by these mobile elements. Figure 5 schematically summarizes the integration of syntenic, gene expression, and phylogeny analyses of all the AQPs localized within the BNF region in species of the N-fixing clade and in their phylogenetically related species R. communis and A. thaliana, focusing on the emergence of NOD26 in the N-fixing clade and highlighting the prediction of at least three functional copies of NOD26 in the ancestor of Legumes.

Table 1 Expression of AQPs within the BNF region in different tissues of N-fixing plants
Fig. 4

Origin of NOD26 in the N-fixing clade and its complex evolution in Legumes. Phylogeny of all AQPs included in the BNF region from angiosperms based on the NJ method. Bootstrap percentages are indicated at the branch points (only values above 50% are indicated). Current NOD26 copies highly upregulated in inoculated roots or nodules in N-fixing species (LJ4g029020 in L. japonicus, Medtr8g087710 and Medtr8g087720 in M. truncatula, Phvul.002G242300 in P. vulgaris, Glyma08g12650 in G. max, LOC107621874 in A. ipaensis, and PanWU01x14_248030 in P. andersonii; see Table 1) and branches connecting these sequences are highlighted, showing the origin of NOD26 in the N-fixing clade. The tree topology shows that the increase in the NOD26 copy number in Legumes occurred first by block duplication (clusters G.1.1.1 and G.1.1.2) and second by tandem duplication (clusters G.1.1.2A and G.1.1.2B). The congruence pattern between orthologous clusters and organisms (see Fig. 2) is shown on the right. Whereas clusters G.1.1.2A and G.1.1.2B showed complete congruence with the organismal tree, cluster G.1.1.1 showed congruence only with the CA- and D-clades, which is in accordance with the particular expansion of AQPs in cluster G.1.1.1, probably mediated by mobile elements (see Fig. 3)

Fig. 5

Schematic representation of the evolution of angiosperm species and their BNF region containing different aquaporins. The integration of synteny (see Fig. 3), gene expression (see Table 1), and phylogeny (see Fig. 4) analyses of AQPs within the BNF region from angiosperm species suggests the origin of NOD26 in the N-fixing clade by tandem duplication, a particular expansion of NOD26 in legumes by both block and tandem duplications, and the loss of NOD26 in the non-N-fixing species Cucumis sativus. Similar to that shown in Fig. 4, current NOD26 copies highly upregulated in inoculated roots or nodules in N-fixing species and branches connecting these sequences are highlighted

After establishing that NOD26 emerged in the N-fixing clade via tandem duplication and was consolidated in Legumes via block and tandem duplications, and under the premise that this symbiotic ammonium channel has an angiosperm origin related to non-symbiotic plant-beneficial functions, we here recreated the genomic and functional transitions that can lead to the emergence of the original N-fixing precursor and its stabilization in Legumes (Fig. 6). Under aerobic conditions, root respiration provides the proton-motive force for the flux of large solutes, including sugars and nitrogen compounds (Fig. 6a). Under anaerobic conditions, angiosperm roots maintain some ATP synthesis through catabolic processes, including lactate and ammonium fermentations (Fig. 6a). Thus, the expression of genes involved in this type of anaerobic process, such as lactate dehydrogenase (Shingaki-Wells et al. 2014) and nitrate reductase (Allegre et al. 2004), is a critical factor for the tolerance of plants to root anoxia. However, under long-term root anoxia, fermentation products accumulate in the cytoplasm, reducing the rate of glycolysis and further decreasing ATP production, which can cause an energy crisis and consequently cell death. These lethal effects can be accelerated by the intrinsic toxicity of ammonium to cytoplasmic enzymes (Esteban et al. 2016) (Fig. 6a).

Fig. 6

Metabolic reconstructions of the origin of the N-fixing precursor in angiosperms and the hypothetical ancestral non-symbiotic role of NOD26 in ammonium detoxification in anoxic roots. This picture is largely explained in the text

During the energy crisis induced by anoxia, the cost of the active export of fermentation products is unsustainable, and, consequently, a selection pressure is exerted for the emergence of passive channels (such as AQPs) that are induced in anaerobic root cells and are able to facilitate the diffusion of these reduced compounds down the concentration gradient (from the cytoplasm to the extracellular space). Arabidopsis AQP AtNIP2;1, a lactate channel specifically induced in root cells during anoxic stress (Choi and Roberts 2007), and the ammonium channel NOD26 probably emerged under this selective pressure, improving the NAD+ recycling necessary to continue glycolysis (Fig. 6a, b). Interestingly, the gene that encodes AtNIP2;1 (AT2G34390) is within the BNF region (Fig. 3), and, consequently, AtNIP2;1 is the best-characterized orthologous protein of NOD26 in angiosperms. Considering that the phylogenetic analysis supported the emergence of AtNIP2;1 before that of NOD26 (Fig. 4), we can explain the retention, long-term persistence, and conservation of NOD26 in the N-fixing clade as a mechanism to maximize the amount of recycled NAD+, improving the tolerance to anaerobic stress (Fig. 6b). It is important to note that the emergence of an ammonium channel induced in anaerobic roots, i.e., the critical evolutionary innovation for the emergence of N-fixing plants proposed in this work, needs few evolutionary events: a tandem duplication of an AQP (including its promoter), followed by low amino acid (80% similar) and functional (small protonated solutes) divergence between the homologous proteins AtNIP2;1 and NOD26 in the ancestor of the N-fixing clade (Fig. 6b). 
In addition, AtNIP2;1 and NOD26 show the same pattern of transport activation induced by acidification (e.g., from pH 7.6 to pH 5.0–6.0) (Choi and Roberts 2007), which is consistent with the rapid cytoplasmic acidification during anoxic stress and the relevance of exporting fermented products under this stress condition (Felle 2005).

Returning to the general aspects of the evolutionary reconstruction of the genesis of N-fixing symbiosis, the evolution of the original N-fixing precursor (Fig. 6b) would have resulted in stable (Legumes) and unstable (non-legumes) N-fixing precursors, which differ mainly in the number of copies of NOD26 (Fig. 6c). Specifically, the ancestor of Legumes increased the number of copies of NOD26, thus improving the stability of this key genetic preadaptation for the emergence of N-fixing symbiosis (Fig. 6c, d), whereas non-legume lineages maintained only the original number or a low number of copies of NOD26, and were therefore more susceptible to losing this N-fixing symbiosis prerequisite by chance (Fig. 6c, d). Lastly, different lineages from the N-fixing clade that retained at least one copy of NOD26 were able to acquire N-fixing symbiosis independently, whereas this acquisition is negatively selected in plants from the N-fixing clade that lost NOD26 (Fig. 6d). In this context, the probable induction of cell division by ancient N-fixing symbionts (Geurts et al. 2016), with the consequent decrease in free oxygen in root cells (Ott et al. 2005), possibly played a critical role in the initial induction of NOD26 expression.

Considering that bacterial N-fixation is an anaerobic process and that common free-living N-fixing strains do not release the fixed nitrogen to the environment, the emergence of the current and well-known ammonium-excreting N-fixing symbiotic rhizobia seems to need the same single evolutionary innovation as their plant hosts: the acquisition of an efficient ammonium channel. Horizontal gene transfer is the main mechanism of bacterial evolution, and genes related to the N-fixation process in rhizobia are normally within conjugative elements able to convert non-symbiotic strains into N-fixing symbionts (Haskett et al. 2016). For example, the genome of the alfalfa-symbiont model strain Sinorhizobium meliloti 1021 includes three replicons, consisting of a circular chromosome (SMc) encoding housekeeping functions, the rhizosphere colonization-related megaplasmid B (SMb), and the N-fixation-related megaplasmid A (SMa) (Galardini et al. 2013). Within the SMa replicon, we found an AQP-like protein produced by SMa0627, which is very similar to NOD26 (68% of amino acid similarity) and is next to fix genes (SMa0612–SMa0626). Considering that fix genes are essential factors for N-fixing symbiosis (Bobik et al. 2006) and that bacterial genes involved in the same pathway are normally co-localized in the genome, SMa0627 is a natural candidate to promote the release of ammonium from bacteria to plants during N-fixing symbiosis.

In biotechnological terms, the results presented here suggest that the NOD26 from N-fixing plants, and maybe the NOD26-like AQPs from their N-fixing symbionts, could be useful tools to improve N-fixation in N-fixing plants and to transfer N-fixation to non-N-fixing plants such as cereals. Constitutive expression of active ammonium channels in almost any microorganism or plant tissue, including rhizobial and root cells, probably has a strong negative effect on fitness. Consequently, ammonium channels for microbial and plant engineering could be used with specific anaerobic-induced promoters, for example, by expressing NOD26 from G. max under the control of the AtNIP2;1 promoter in the model non-N-fixing plant Arabidopsis. Since there is a strong framework for the deregulation of transgenic crops, these types of transgenic plants with greater capacity for ammonium interchange under anaerobic conditions could be used to improve the uptake of ammonium produced by N-fixing bacteria in future commercial crops. In contrast to sexual organisms like animals and plants, which exhibit a robust containment of transgene flow, microorganisms show extremely promiscuous gene flow, and consequently, there are no genetically modified bacteria on the market. However, domesticated mutant plant-growth-promoting strains are commonly accepted as inoculants to improve crop production. In this framework, and considering that the substitution of a single amino acid residue can alter the transport profiles of some AQPs (Azad et al. 2012; Krenc et al. 2014), we propose bacterial genome editing of current AQPs with CRISPR-Cas9 to mimic the ammonium channel, as a strategy to produce novel ammonium-excreting N-fixing non-transgenic inoculants to improve the nitrogen content in legumes and cereal crops. 
Finally, future characterizations of transgenic plants and edited N-fixing bacteria overexpressing ammonium channels in roots under anoxia can contribute not only to improving crop production but also to empirically recapitulating the origin of N-fixing symbiosis.