Introduction

A critical goal in bacteriology is to understand patterns of genotypic abundance and epidemic spread. Of particular interest are host-associated bacteria, including pathogens and symbionts. These diverse bacterial lineages colonize host surfaces, inhabit specific tissues or cells, and can often persist free in soils and/or aquatic habitats between phases of host infection [1, 2]. The capacity of bacteria to thrive in host tissues is often modulated by the presence of plasmids and genomic islands, cassettes of accessory loci that can be transmitted among genomes. Horizontal transfer of these accessory loci often endows bacteria with suites of fitness-enhancing traits including host infection capacity, pathogenicity, multidrug resistance, and metabolic flexibility [3–5]. The acquisition of plasmids and genomic islands has been implicated in epidemic outbreaks, including those of Pseudomonas aeruginosa, Staphylococcus aureus, and Yersinia spp. [6–8]. But in natural settings, we understand little about how bacterial strains vary in their capacity to dominate local sites or host populations or to spread among sites across ecological barriers. In particular, almost nothing is known about patterns of dominance and epidemic spread in symbiotic bacteria, which are important for human health, the success of crops, and other ecosystem services.

Rhizobia are proteobacteria that can exhibit the capacity to infect leguminous plants and fix atmospheric nitrogen for their hosts [9]. Globally, rhizobia are responsible for the fixation of ~150 Tg of nitrogen per year [10], and their symbiosis with legumes represents the largest input source of nitrogen into terrestrial ecosystems [11]. In agriculture, legumes account for ~27 % of global crop production [12] and are valued for their capacity to grow in nitrogen-depauperate soils. Similar to bacterial pathogens, rhizobia can acquire accessory DNA elements that confer the capability to colonize and infect hosts. Rhizobial genomes are subdivided into genome regions specific for their life stages, with chromosomal loci expressed during free-living phases in the soil and symbiosis loci expressed inside of host cells [13, 14]. Symbiosis loci required for host nodulation and nitrogen fixation are clustered onto large plasmids or genomic islands [15–19] that can be transferred among lineages, presumably via conjugation [20–22]. Non-nodulating rhizobia are also common [23, 24], and these strains often lack some or all of the characterized symbiosis loci [23–28].

Bradyrhizobium is the most cosmopolitan rhizobial lineage and is found free-living in soils and in aquatic environments as well as in symbiotic association with plant and animal hosts, including humans [29–37]. Recent work suggests that there are ~19 species of Bradyrhizobium that form symbiotic associations with legumes [36], although non-symbiotic strains are also common in soils [37]. Bradyrhizobium can nodulate diverse wild legumes as well as staple crops such as soybeans (Glycine), peanuts (Arachis), and cowpea (Vigna) [38, 39]. In the model genome, Bradyrhizobium diazoefficiens strain USDA110 (previously B. japonicum), symbiosis-specific genes are clustered within a 410-kb symbiosis island in which the G + C content differs from the rest of the genome [40]. Some divergent Bradyrhizobium spp. (e.g., BTAi1 and ORS278) lack a symbiosis island and use a different mechanism to nodulate hosts [41]. Previous studies have revealed that Bradyrhizobium populations can exhibit patterns consistent with epidemics such as a few genotypes existing at high frequency at a single site [33] or genotypes isolated at multiple locations [34, 35, 37]. But these patterns remain poorly understood, and it is unclear what role the symbiosis island might play as a driver of increased abundance and epidemic spread.

Here, we investigated the population genetic structure of Bradyrhizobium spp. cultured from Acmispon strigosus (formerly Lotus strigosus), a native annual legume common across the Pacific Southwest of the United States. We cultured 850 A. strigosus nodules from 14 natural sites across California encompassing 185 plants collected over a >840-km transect. In parallel, we isolated 442 root surface Bradyrhizobium from three focal host populations within this range, which includes isolates that lack symbiosis islands and cannot infect Acmispon hosts. All 1292 isolates were sequenced at two hypervariable chromosomal loci, and we used a combination of greenhouse inoculation assays and PCR to test for presence of the symbiosis island in all the root surface isolates. We assigned haplotypes and symbiotic capacity information to all isolates and examined the frequency and spatial spread of epidemic rhizobial genotypes within and among host populations. Our goals were to (i) investigate genotype frequency and spatial spread of Bradyrhizobium in native A. strigosus hosts, (ii) infer the presence or absence of the symbiosis island in Bradyrhizobium, (iii) test for the role of symbiosis island acquisition as a driver of Bradyrhizobium strain dominance, and (iv) test for community structure of rhizobial isolates due to other abiotic or biotic factors.

Methods

Collection of Bradyrhizobium Isolates

Bradyrhizobium was isolated from the nodules and the soil-root interface of A. strigosus, and clonal cultures were grown and archived for genotyping following published protocols [33]. Briefly, whole plants were excavated from native sites and transported in sealed plastic bags to the laboratory, where they were washed with tap water to remove soil and root nodules were removed with sterilized tools. Nodules were surface sterilized with bleach and rinsed with sterile water before being crushed with glass rods, and the contents were plated on a modified arabinose gluconate medium (MAG) [33]. For root surface isolates, the roots were dissected into ~1-cm sections, divided into root tips and “old” roots, and vortexed in a sterile solution of 0.01 % Tween 20 (Fisher Scientific, Fair Lawn, NJ). The wash solution was then serially diluted and plated on glucose-based rhizobium-defined medium (GRDM) with cycloheximide as an antifungal and bromothymol blue as a pH indicator [33]. Among the resultant colonies, we selected for Bradyrhizobium based on growth rate, color, and ability to grow on MAG and GRDM but not on Luria–Bertani (LB) medium [33]. Plant hosts for culturing were collected from 14 field collection sites across California covering an ~840-km transect. Collection sites included University of California Natural Reserves (Bodega Marine Reserve, Burns Piñon Ridge Reserve, and Motte Rimrock Reserve); an undeveloped site in the hills above University of California–Riverside; a biological field station in Claremont, CA (Robert J. Bernard Biological Field Station); natural preserves (Madrona Marsh Preserve, Pismo Dunes Natural Preserve, and Whitewater Preserve); a wildlife refuge (Guadalupe–Nipomo Dunes National Wildlife Refuge); two separate sites within a large state park (Anza Borrego Desert State Park); a municipal park (Griffith Park); a site adjacent to the San Dimas Reservoir; and an undeveloped site adjacent to human development (San Dimas Canyon) (Table S1). Nodule isolates were collected from plants at all sites, but root surface isolates were only collected from plants at the Bodega Marine Reserve, Motte Rimrock Reserve, and the undeveloped site in the hills above the University of California–Riverside.

Sequencing and Haplotype Analysis

Partial sequences from two chromosomal loci, glutamine synthetase (glnII) and recombinase (recA), totaling ~1 kb, were PCR amplified and sequenced at the Institute for Integrative Genome Biology of UC Riverside using published protocols [34]. Only sequences with unambiguous bases were used, leading to a total of 1292 sequenced isolates. Sequences from each locus were aligned separately using Clustal Omega, and isolates with identical sequences for each locus were identified using the “find redundant” command within the MacClade program [42]. Unique sequences, or haplotypes, were defined for each locus separately and for the concatenated dataset. Abundance was calculated for the concatenated dataset as the number of times each haplotype was isolated.
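The haplotype assignment described above can be illustrated with a minimal Python sketch. It is not the MacClade workflow used in the study: the toy sequences, the isolate identifiers, the function name assign_haplotypes, and the numbering of labels by order of first appearance are all assumptions made for illustration. Only the label pattern (a glnII label joined to a recA label, as in G03_R01) follows the text.

```python
from collections import Counter

def assign_haplotypes(sequences, prefix):
    """Give each distinct aligned sequence a label such as 'G01'.

    Labels are numbered by order of first appearance, which is an
    arbitrary choice made for this illustration.
    """
    labels = {}
    for seq in sequences:
        if seq not in labels:
            labels[seq] = f"{prefix}{len(labels) + 1:02d}"
    return labels

# Hypothetical toy isolates: (isolate id, glnII fragment, recA fragment).
isolates = [
    ("iso1", "ACGT", "TTGA"),
    ("iso2", "ACGT", "TTGA"),
    ("iso3", "ACGA", "TTGA"),
    ("iso4", "ACGT", "TTGC"),
    ("iso5", "ACGT", "TTGA"),
]

# Haplotypes are defined for each locus separately ...
gln_labels = assign_haplotypes([g for _, g, _ in isolates], "G")
rec_labels = assign_haplotypes([r for _, _, r in isolates], "R")

# ... and for the concatenated dataset, where two isolates share a
# haplotype only if they are identical at both loci.
concatenated = {
    iso: f"{gln_labels[g]}_{rec_labels[r]}" for iso, g, r in isolates
}

# Abundance is the number of times each concatenated haplotype was isolated.
abundance = Counter(concatenated.values())
```

In this toy example, three of the five isolates collapse into a single concatenated haplotype and the other two are unique.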

Symbiotic Capacity Assessment

We conducted a combination of assays on root surface isolates to test for symbiotic capacity. A subset of isolates (75) was previously assessed using greenhouse nodulation assays on A. strigosus, which has already been shown to be a permissive host for diverse Bradyrhizobium [24, 43, 44]. Here, we conducted greenhouse nodulation experiments on an additional 55 isolates, using identical procedures. Briefly, at least five seedlings per tested Bradyrhizobium isolate were grown in sterile conditions and were inoculated clonally with 5 × 10^8 cells, and parallel control hosts were inoculated with sterile water. At 8 weeks postinoculation, all hosts were unpotted, roots and shoots were weighed, and roots were checked for nodules. In all cases, controls lacked nodules. Hosts given the same inoculated strain either all became nodulated or all lacked nodules. The remaining 342 root surface isolates were classified as symbiotic or non-symbiotic based on success or failure of PCR amplification of at least one symbiosis island locus (nifD, nodD-A, nodZ, and nolL) [22, 24, 45]. Earlier analyses showed that successful amplification of these loci, giving a band of the correct size, is a reliable indicator of presence of the symbiosis island [24]. Many isolates were tested at two or more loci (160/342; Table S1).
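The PCR-based classification can be sketched as a simple decision rule. The sketch below is one possible reading of the text, not the scoring procedure actually used: it assumes that a band of the expected size at any tested locus is taken as evidence of the symbiosis island, and it flags isolates whose tested loci disagree. The function name classify_by_pcr and the input format are hypothetical.

```python
SYMBIOSIS_LOCI = ("nifD", "nodD-A", "nodZ", "nolL")

def classify_by_pcr(results):
    """Classify one isolate from its PCR results.

    results maps a locus name to True (band of the expected size),
    False (no band), or None / missing (locus not tested).
    Returns (status, conflicting). Assumption: any successful
    amplification is read as presence of the symbiosis island.
    """
    tested = [results[locus] for locus in SYMBIOSIS_LOCI
              if results.get(locus) is not None]
    if not tested:
        return "untested", False
    status = "symbiotic" if any(tested) else "non-symbiotic"
    # Conflicting information: some tested loci amplified, others did not.
    conflicting = any(tested) and not all(tested)
    return status, conflicting

single_locus = classify_by_pcr({"nifD": True})
mixed_result = classify_by_pcr({"nifD": True, "nodZ": False})
no_bands = classify_by_pcr({"nifD": False, "nolL": False})
```

Flagging mixed results separately keeps conflicting isolates visible for later inspection instead of silently folding them into one class.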

Phylogenetic Reconstruction and Species Designation

Phylogenies were reconstructed using the concatenated glnII and recA sequences from the cultured isolates and from the following reference strains: Bradyrhizobium arachidis (CCBAU33067), B. betae (PL7HG1), B. canariense (SEMIA928), B. cytisi (LMG25866), B. diazoefficiens (SEMIA5080), B. elkanii (USDA46), B. iriomotense (EK05), B. diazoefficiens (USDA110), B. lablabi (CCBAU61434), B. liaoningense (SEMIA5025), B. retamae (Ro19), and B. yuanmingense (R2m). Reference strains were chosen to represent all known species of Bradyrhizobium that aligned fully with our sequenced glnII and recA regions (National Center for Biotechnology Information (NCBI) as of November 18, 2014), and Mesorhizobium loti (MAFF303099) was used as an outgroup. All sequences were compared against the reference strains using BLAST to confirm their identity as Bradyrhizobium spp. The GTR model of evolution was selected using the Akaike information criterion in jModelTest2 [46], and the phylogenetic tree was reconstructed in PhyML 3.0 [47] utilizing a BioNJ starting tree and subtree pruning and regrafting (SPR). Branch support was estimated using the fast approximate likelihood ratio test (aLRT) and the Shimodaira–Hasegawa-like (SH-like) procedure [48]. To be consistent with other studies, Bradyrhizobium species were defined as the monophyletic clades including no more than one type strain with branch support ≥0.90 [49], while attempting to adhere to past species demarcations that utilized some of the same loci [35]. We analyzed inter-species variation using the ratio of fixed to shared polymorphisms in DnaSP [50].
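Model selection by the Akaike information criterion can be illustrated with a short sketch. The log-likelihoods below are invented for illustration and are not values from this study. The parameter counts cover only the free substitution parameters of each model, which is an assumption of this sketch; parameters shared by all candidates, such as branch lengths, add the same constant to every score and do not change the ranking.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical (ln L, number of free substitution parameters) per model.
candidates = {
    "JC": (-2450.0, 0),
    "HKY": (-2400.0, 4),
    "GTR": (-2390.0, 8),
}

scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}

# The preferred model is the one with the smallest AIC.
best_model = min(scores, key=scores.get)
```

With these invented likelihoods the richer model is preferred because its gain in fit outweighs the penalty for its extra parameters.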

Sequence Statistics

Using the concatenated dataset, we calculated strain richness (number of unique haplotypes/number of isolates) and strain dominance (abundance of each haplotype/number of isolates), analogues of species richness and evenness [51]. For each of the 14 field collection sites, haplotypes were defined as dominant if they were collected at least five times and represented at least 10 % of the total isolates at that site. Spatial spread was defined as the maximum distance between any individual isolates with the same haplotype. GPS coordinates for distances used the midpoint of each of the 14 field collection sites, because distances within sites were small compared to distances between sites. We also calculated Hd (haplotype diversity: the probability that two haplotypes drawn uniformly at random from the population are not the same), π (nucleotide diversity: the average number of nucleotide differences per site between two sequences), k (average number of nucleotide differences), linkage disequilibrium (average absolute D′), recombination (R), and the minimum number of recombination events [52–56] using DnaSP [50].
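The site-level statistics defined above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the DnaSP implementation: Hd is computed as the probability that two isolates sampled without replacement carry different haplotypes, distances between site midpoints are taken as great-circle (haversine) distances on a sphere of radius 6371 km, and the toy haplotype labels and coordinates are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def strain_richness(haplotypes):
    """Number of unique haplotypes divided by number of isolates."""
    return len(set(haplotypes)) / len(haplotypes)

def strain_dominance(haplotypes):
    """Abundance of each haplotype divided by number of isolates."""
    n = len(haplotypes)
    return {h: c / n for h, c in Counter(haplotypes).items()}

def dominant_haplotypes(haplotypes, min_count=5, min_fraction=0.10):
    """Haplotypes collected at least five times and making up >= 10 % of a site."""
    n = len(haplotypes)
    return {h for h, c in Counter(haplotypes).items()
            if c >= min_count and c / n >= min_fraction}

def haplotype_diversity(haplotypes):
    """Hd = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two isolates are required")
    sum_sq = sum((c / n) ** 2 for c in Counter(haplotypes).values())
    return n / (n - 1) * (1 - sum_sq)

def haversine_km(a, b):
    """Great-circle distance between two (latitude, longitude) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def spatial_spread_km(site_midpoints):
    """Maximum distance among the sites where a haplotype was found (0 for one site)."""
    return max((haversine_km(a, b)
                for a, b in combinations(site_midpoints, 2)), default=0.0)

# Hypothetical isolates from a single site.
site = ["H1"] * 6 + ["H2"] * 3 + ["H3"]
richness = strain_richness(site)
dominant = dominant_haplotypes(site)
hd = haplotype_diversity(site)

# Hypothetical midpoints of two sites where one haplotype was collected.
spread = spatial_spread_km([(34.0, -117.0), (35.0, -117.0)])
```

In the toy site, H2 makes up 30 % of isolates but was collected only three times, so it fails the count threshold and only H1 qualifies as dominant.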

Trait Analysis

We tested for significant phylogenetic signal for the traits of symbiotic capacity, abundance, and spatial spread, which is a prerequisite for quantitative analysis of traits in a phylogenetic framework. We used Pagel’s lambda, estimated with the “fitDiscrete” function in the “Geiger” package [57], and for symbiotic capacity, we also used Fritz and Purvis’ D, which was estimated using the “phylo.d” function in the “Caper” package [58]. The Mk1 model of maximum likelihood as well as parsimony were used for ancestral state reconstruction of symbiotic capacity with a modified phylogenetic tree in Mesquite [59]. Duplicate taxa were added to the phylogenetic tree whenever a single haplotype encompassed both symbiotic and non-symbiotic isolates to avoid the ambiguity of multiple character states being assigned to a single taxon (i.e., haplotype). We tested for correlated evolution between symbiotic capacity and haplotype abundance with the phy.anova command in the Geiger package in R [57]. We took two steps to avoid sampling bias in this analysis. First, we used the subset of isolates collected from plants where both root surface and nodule collections had been made. Second, among these plants, we sampled equally across the whole root system, representing samples from nodules and from the soil-root interface with equal probability per sampled root. The resultant dataset included 442 root surface isolates and 116 nodule isolates from three field locales. We also used a standard ANOVA in JMP [60] to examine variation between symbiotic and non-symbiotic isolates in terms of abundance and spatial spread. This latter analysis does not take phylogenetic relationships into account and thus assumes that trait data are independent of strain relatedness.

Community Structure

We analyzed isolation by distance with a Mantel test correlating F_ST and physical distance matrices within PASSaGE [61]. We used Fast UniFrac [62] to test for significant differentiation among Bradyrhizobium communities at different collection sites. The “cluster sample” tool was used to cluster the collection sites based on the phylogenetic lineages they contained, and the “jackknife sample cluster” tool was used to assess confidence in the collection site clusters. We employed the “sample distance matrix” to numerically compare distances among collection sites. Abundance was incorporated into Fast UniFrac analyses whenever possible. The jackknife analysis was performed with the number of sequences kept equal to the smallest sample size with 100 permutations. We used the “exact test of population differentiation” in Arlequin [63] to assess differentiation among collection sites and to investigate other drivers of Bradyrhizobium community structure including root isolate types (nodule, root-tip surface, and old root surface), symbiotic statuses (symbiotic and non-symbiotic), and collection year.
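The logic of a Mantel test for isolation by distance can be sketched as follows. This is a generic one-sided permutation version written for illustration, not the PASSaGE implementation, whose settings are not described here. The function names, the number of permutations, the random seed, and the toy matrices are all assumptions.

```python
import random
from itertools import combinations
from statistics import mean

def _upper(matrix, order):
    """Upper-triangle entries after relabeling rows and columns by `order`."""
    return [matrix[order[i]][order[j]]
            for i, j in combinations(range(len(order)), 2)]

def _pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def mantel(genetic, geographic, permutations=999, seed=0):
    """Correlate two symmetric distance matrices and estimate a one-sided p-value.

    Site labels of the genetic matrix are shuffled jointly over rows and
    columns; the p-value is the share of permutations (including the
    observed arrangement) with a correlation at least as large as observed.
    """
    identity = list(range(len(genetic)))
    geo = _upper(geographic, identity)
    observed = _pearson(_upper(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if _pearson(_upper(genetic, order), geo) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)

# Hypothetical distances (km) among four sites placed along a line, and
# hypothetical pairwise F_ST values that increase linearly with distance.
km = [[0, 100, 200, 300],
      [100, 0, 100, 200],
      [200, 100, 0, 100],
      [300, 200, 100, 0]]
fst = [[d / 1000 for d in row] for row in km]

r_obs, p_value = mantel(fst, km)
```

Permuting whole site labels, rather than individual matrix entries, preserves the dependence among distances that share a site, which is why an ordinary correlation test is not used here.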

Results

Haplotype Designation, Abundance, and Spatial Spread

The 1292 concatenated glnII and recA sequences resulted in 290 haplotypes, all of which were classified as Bradyrhizobium. Most of the haplotypes were unique (isolated a single time, 184/290; Table S1). Among the remaining haplotypes, 13 were defined as dominant in at least one site and these 13 haplotypes constituted the majority of collected isolates (706/1292). We found dominant haplotypes at all but the least sampled field collection site (Anza Borrego Desert State Park–Palm Canyon; Table 1). Most haplotypes (257/290) were only found at a single collection site. Conversely, among the dominant haplotypes, most were also found to be epidemic (7/13; collected at sites spanning ≥10 km). Spatial spread for epidemic haplotypes varied from ~100 to 750 km, and we collected epidemic haplotypes at all but the two least sampled sites (Anza Borrego Desert State Park–Palm Canyon and Pismo Dunes Natural Preserve) (Table 1). One epidemic haplotype (G03_R01) encompassed 27 % of all isolates collected (355 isolates) and was found at all but four collection sites (Pismo Dunes Natural Preserve, Griffith Park, and Anza Borrego State Park–Road/Palm Canyon sites; Fig. 1).

Table 1 Dominant and epidemic haplotypes
Fig. 1

Map of California indicating collection sites with black dots. Pie charts connected to black dots illustrate the relative frequencies of five focal haplotypes. The five haplotypes chosen incorporate the four haplotypes with the highest abundance and the four haplotypes with the greatest spatial spread. The distribution of epidemic haplotype G03_R01 is hypothesized based on presence or absence data from all collection sites. Starting from the upper left and moving clockwise, the collection sites are Bodega Marine Reserve, San Dimas Canyon, San Dimas Reservoir, Bernard Field Station, UC Riverside, Motte Rimrock Reserve, Whitewater, Burns Piñon Ridge, Anza Borrego Canyon, Anza Borrego Roadside, Madrona Marsh, Griffith Park, Nipomo Dunes, and Pismo Dunes

Symbiotic Capacity Assessment

Using the combined dataset of greenhouse inoculation assays and PCR assays, we inferred 886 isolates to be symbiotic and 406 to be non-symbiotic (Table S1). For ~2 % of the 1292 isolates, there was at least one piece of conflicting information about symbiotic capacity, most often including nodule isolates that failed to amplify one or more of the four symbiosis island loci (17) but also including conflicting results between PCR amplification assays (9) (Table S1).

Most dominant haplotypes only encompassed isolates that were inferred to be symbiotic, but a single dominant haplotype (G64_R29) only had non-symbiotic isolates. Of the dominant haplotypes that included both symbiotic and non-symbiotic isolates, mean abundance was higher for symbiotic (36) versus non-symbiotic isolates (15.6) but the difference was not significant (t = 1.48, df = 4, p = 0.214). Among the epidemic haplotypes, most encompassed symbiotic and non-symbiotic isolates (5/7), with symbiotic isolates being more frequent on average than non-symbiotic ones, but without a significant difference (90.6 vs 18.4; t = 2.05, df = 4, p = 0.110).

Phylogenetic Analysis

The Bradyrhizobium phylogeny encompassed 20 deeply diverged, monophyletic lineages (clades) that are consistent with species-level divergence, including six that were previously identified (B. betae, B. canariense, B. cytisi, B. liaoningense, and B. retamae) and 14 clades that did not encompass one of the type strains (Figs. 2 and S1). Most (161) comparisons between these 20 clades uncovered more fixed than shared polymorphisms, also consistent with species designation. Two comparisons had the same number of fixed and shared polymorphisms, and 27 had more shared than fixed polymorphisms (Table S2). Almost half of these clades (8/20) were only collected at a single site. However, nearly all collection sites (13/14) were inhabited by multiple clades of Bradyrhizobium (Table S3). Bradyrhizobium canariense was particularly widespread and was collected at 11 sites. Population genetic statistics were analyzed for all Bradyrhizobium clades that encompassed multiple isolates (Table 2). Linkage was high between all SNPs for all clades (>0.9). Strain richness, Hd, π, and recombination varied widely between clades, likely because of variation in sampling. When only clades with over 40 isolates were assessed, Hd, π, and recombination were comparable.
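As a quick arithmetic check, the three categories of comparisons reported above can be compared with the number of distinct pairs among 20 clades. The variable names are ours; the counts are those given in the text.

```python
from math import comb

n_clades = 20
more_fixed, equal, more_shared = 161, 2, 27

# Number of distinct pairwise comparisons among 20 clades: C(20, 2).
total_pairs = comb(n_clades, 2)

# Comparisons accounted for by the three reported categories.
accounted = more_fixed + equal + more_shared
```

Both quantities equal 190, so the three categories cover every pairwise comparison among the clades.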

Fig. 2

PhyML 3.0 phylogenetic tree reconstructed from concatenated glutamine synthetase (glnII) and recombinase (recA) loci. Non-symbiotic taxa are indicated with red dots and symbiotic taxa with black dots. Inferred ancestral states of non-symbiotic and symbiotic are indicated by red and black branches, respectively, and ambiguous nodes are colored grey (estimated likelihood proportion was <66.66 %). The relative abundance of a haplotype and the spatial spread are indicated by stacked blue and green bars. Major clades are indicated with brackets. Reference strains can be identified by the lack of symbiotic capacity, abundance, and spatial spread data. The strains include (1) Mesorhizobium loti MAFF303099, (2) Bradyrhizobium retamae Ro19, (3) Bradyrhizobium elkanii USDA46, (4) Bradyrhizobium lablabi CCBAU61434, (5) B. diazoefficiens SEMIA5080, (6) B. diazoefficiens (japonicum) USDA110, (7) Bradyrhizobium betae PL7HG1, (8) Bradyrhizobium iriomotense EK05, (9) Bradyrhizobium arachidis CCBAU33067, (10) Bradyrhizobium yuanmingense R2m, (11) Bradyrhizobium liaoningense SEMIA5025, (12) Bradyrhizobium cytisi LMG25866, and (13) Bradyrhizobium canariense SEMIA928

Table 2 Species statistics for 19 Bradyrhizobium clades calculated in DnaSP

Ancestral state reconstructions were similar for parsimony and likelihood models (Figs. 2 and S2), and both infer gains and/or losses of symbiotic capacity across multiple Bradyrhizobium clades (Table S4). Because many of the nodes on the phylogenetic tree are poorly resolved, it is impossible to estimate the number of gain or loss events with confidence. Considering only the 20 deeply diverged clades with likelihood support values >0.90, 10 of them encompassed both symbiotic and non-symbiotic isolates, consistent with multiple gain and loss events (Fig. S1). Using both parsimony and a conservative maximum likelihood estimate (likelihood decision threshold of 2.0), we inferred about 1.5–2× as many loss events as gains (Table S4).

Symbiotic capacity exhibited significant phylogenetic signal, but abundance and spatial spread did not (Table S5). When we compared symbiotic capacity and abundance using chi-squared tests in JMP, which do not account for phylogenetic relationships, we found a significant positive correlation, indicating that symbiotic clades on average exhibit higher abundance than non-symbiotic clades. However, when we tested for correlated evolution of these traits in a phylogenetic framework, we did not find a relationship (Tables S6 and S7), indicating that acquisition of a symbiosis island is not statistically associated with an increase in haplotype abundance. Moreover, we did not find evidence for correlated evolution of symbiotic capacity with spatial spread using either the chi-squared tests in JMP or the phy.anova in Geiger.

Community Analyses

We found significantly different Bradyrhizobium communities among most collection sites (Table S8), among different isolate types (nodule, root-tip surface, and old root surface), symbiotic statuses (symbiotic and non-symbiotic), and collection years (Table S9). Non-symbiotic populations were significantly differentiated from symbiotic isolates within the same site at the Bodega Marine Reserve and Motte Rimrock Reserve (Table S10). Two of the four sites that were sampled multiple times exhibited population differentiation from year to year: San Dimas Canyon and Burns Piñon Ridge Reserve (Table S11). We did not find evidence for isolation by distance using the Mantel test. This is supported by the Fast UniFrac analyses, in which clustering closely follows species makeup rather than geographical location (Figs. S3 and S4).

Discussion

Our study uncovered an epidemic distribution of Bradyrhizobium haplotypes across California with a striking divide between rare and dominant haplotypes. Although we recovered 290 haplotypes, the majority of isolates were encompassed within 13 dominant haplotypes (707/1292). Among the dominant haplotypes, six were endemic and seven were epidemic including haplotype G03_R01 that was dominant at most sites (10/14), exhibited a spatial spread of 750 km, and constituted nearly 30 % of the total isolates (355/1292) (Table 1). Several studies have uncovered rhizobial genotypes that achieve high frequency at localized sites [33, 51, 64], but haplotype G03_R01 dominates both locally and also across major ecological boundaries including mountain ranges and areas of inhospitable habitat. It is striking that haplotype G03_R01 was found in sites that vary markedly in their patterns of rainfall, temperature, plant community, and key soil nutrients [65, 66], especially given that most of the Bradyrhizobium lineages that we uncovered were relatively localized in their distribution. Recent work that focused on non-symbiotic lineages of Bradyrhizobium in soils has also found evidence of widespread genotypes, suggesting that this might be a common pattern for these bacteria [37].

We uncovered a broad diversity of Bradyrhizobium species nodulating and inhabiting the root surfaces of A. strigosus. The most recent efforts in defining symbiotic lineages of Bradyrhizobium propose 19 species [36]. Using a conservative species definition (monophyletic clades including no more than one type strain with branch support ≥0.90), we recovered 20 species including 14 lineages that were not represented in the NCBI database using these common loci (Fig. S1). No single method has been agreed upon for defining species of bacteria, but most definitions use phylogenetic clusters of multiple conserved loci, as we did here [67]. Despite the diversity that we uncovered, six out of the seven epidemic haplotypes are found within the B. canariense and B. novel I species (Table 2). Important differences can exist between rhizobial lineages including the diversity of host plants infected, metabolite utilization, and antibiotic resistance [68–70], and these sources of variation may be driving differences in epidemic distributions among Bradyrhizobium taxa. We also found that Bradyrhizobium communities from different collection sites varied in species diversity (Table S3). We found the greatest diversity at the three sites where we collected root surface samples, consistent with previous work uncovering greater genotypic diversity among root surface isolates as opposed to nodule populations [33].

Genomic island acquisition can facilitate novel bacterial traits that confer fitness benefits in an environment-specific manner [71]. However, we did not find support for the hypothesis that symbiosis island acquisition is an evolutionary driver of strain abundance or spatial spread in Bradyrhizobium (Table S7). Instead, we uncovered epidemics of only a few extremely abundant symbiotic haplotypes within B. canariense, and in this clade, we found evidence consistent with multiple events of symbiosis island gain and loss. Our dataset shows that both acquisition and loss of the symbiosis island appear to occur frequently across the sampled Bradyrhizobium populations. Both maximum likelihood and parsimony analyses suggest many more loss than gain transitions. Many types of mutations can convert a symbiotic strain to a non-symbiotic one, but only the wholesale horizontal transmission of the symbiosis island has been associated with gain of nodulation and nitrogen fixation in Bradyrhizobium. Given the more frequent loss events, some additional diversification must occur in the symbiotic lineages relative to the lineages without symbiosis islands.

The capacity to infect A. strigosus appears to have a significant effect on structuring Bradyrhizobium populations. We found support for differentiation in 2/3 comparisons of symbiotic and non-symbiotic communities (Table S10), and we also found that 8/9 comparisons of Bradyrhizobium communities collected from different plant parts exhibited differentiation (Table S9). We also found evidence for local differences among collection sites that could be responsible for Bradyrhizobium population differentiation; however, drift cannot be ruled out by these analyses (Table S8). Finally, we found evidence for temporal differentiation in 3/5 comparisons (Table S11). Taken together, these data are consistent with previous evidence showing that plant and soil factors affect population structure in the rhizosphere [72].

In summary, our analysis found that native Bradyrhizobium populations across California are dominated by a small handful of haplotypes. We found that on average, symbiotic strains are more common than non-symbiotic ones and that symbiotic strains are also more likely to be dominant and to spread spatially. Nonetheless, we found evidence of both symbiotic and non-symbiotic strains achieving high abundance and spreading across great distances. Our ancestral state reconstruction inferred that the symbiosis island has been repeatedly gained and lost across the A. strigosus sampling range, and we did not find support for the hypothesis that acquisition of symbiosis islands serves as a driver of strain dominance or spread. We suggest that loci on the Bradyrhizobium chromosome, which encode traits important for soil survival [13, 14], are most likely responsible for variation in the capacity to spread epidemically.