Introduction

Soybean (Glycine max (L.) Merr.) is one of the most important leguminous crops grown and consumed worldwide due to its high protein (~ 40%) and oil (~ 18–19%) concentrations. Soybean establishes symbiotic associations with Bradyrhizobium japonicum (Strodtman and Emerich 2009) which reduces atmospheric N2 to ammonia and provides N to the plant. This association decreases the requirement for N fertilizers for soybean and other leguminous crops and improves soil fertility (Giller 2001; Jensen and Hauggaard-Nielsen 2003). In soils with little available soil N, symbiotic N2 fixation can meet up to 85–90% of the soybean N requirement (Mastrodomenico and Purcell 2012).

Symbiotic N2 fixation is sensitive to various abiotic stresses including drought, flooding, soil salinity, soil acidity, mineral deficiency or toxicity, and low/high temperature (Ramaekers et al. 2013). Water deficit conditions negatively impact N2 fixation in soybean by reducing nodulation and nitrogenase activity, which ultimately decreases soybean yield (Márquez-García et al. 2015; Serraj et al. 1999a). Also, it has been reported that N2 fixation is more sensitive to water deficit than photosynthesis under both controlled and field conditions (Adams et al. 2016; Djekoun and Planchon 1991; Durand et al. 1987; Kuo and Boersma 1971; Sinclair et al. 1987). Proposed mechanisms for decreased N2 fixation during water deficit conditions include carbon shortage, oxygen limitation, and feedback inhibition by products of N2 fixation (Purcell 2009; Serraj et al. 1999b).

Methods for quantifying N2 fixation include the N-difference method (Weaver and Danso 1994), acetylene reduction assay (ARA) (Hardy et al. 1968), 15N enrichment (Fried and Broeshart 1975; Fried and Middleboe 1977), 15N natural abundance (Shearer and Kohl 1986), and relative abundance of ureides (Unkovich et al. 2008). Each method has specific advantages over others, but easy, rapid, inexpensive, and quantitative methods for estimation of N2 fixation under both controlled and field conditions are still needed.

Among the various methods for estimating N2 fixation, 15N natural abundance (δ15N) is frequently used to quantify the fraction of N derived from the atmosphere (NDFA) in large-scale field experiments and to serve as an index of N2 fixation (Andrews and Lea 2013; Barrie et al. 1995; Letolle 1980). This method compares the abundance of the 15N isotope in plant tissue, the atmosphere, and the soil environment with respect to the 14N isotope. The atmosphere has a lower concentration of 15N compared to the soil due to the N transformations in soil. The difference in 15N and 14N concentration between soil and atmosphere is expressed in terms of parts per thousand (‰) and is referred to as the N isotope ratio (δ15N) (Peoples et al. 1989). N2 fixation dilutes the 15N in plants actively fixing N2 as compared to plants that depend on mineral N as an N source (Doughton et al. 1995; Shearer and Kohl 1986). A low δ15N value is a favorable trait for selection because it indicates greater dilution of 15N by biological N2 fixation.

The percentage of NDFA from δ15N (Kohl and Shearer 1981) is calculated according to the equation below:

$$\% NDFA= \frac{{\delta }^{15}{N}_{ref}- {\delta }^{15}{N}_{samp}}{{\delta }^{15}{N}_{ref}- {\delta }^{15}{N}_{0}}*100$$

where δ15Nref is the composition of a plant totally dependent on soil N (non-nodulating genotype), δ15Nsamp is the composition of the individual samples, and δ15N0 (−1.30‰ for soybean, Bergersen et al. 1989) is the δ15N from a plant totally dependent on N2 fixation. The reference genotype in this equation reduces the error/noise caused by soil N variability in calculating % NDFA. However, the δ15N of the reference genotype is often relatively uniform across a field (Peoples et al. 2002), indicating that in the absence of a reference crop, δ15N can be used directly to estimate the amount of N fixed by genotypes via N2 fixation (Steketee et al. 2019).
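As a worked illustration of the % NDFA equation, the following minimal Python sketch computes % NDFA from δ15N values. The function name and the example δ15N values are hypothetical and chosen only for illustration; the default of −1.30 for a plant fully dependent on N2 fixation is the soybean value cited in the text.

```python
def percent_ndfa(delta_ref, delta_samp, delta_fix=-1.30):
    """Percentage of N derived from the atmosphere, from delta-15N values (per mil).

    delta_ref  : delta-15N of a non-nodulating reference plant (soil N only)
    delta_samp : delta-15N of the sampled plant
    delta_fix  : delta-15N of a plant fully dependent on N2 fixation
                 (-1.30 for soybean, as cited in the text)
    """
    denominator = delta_ref - delta_fix
    if denominator == 0:
        raise ValueError("reference and fully fixing delta-15N values must differ")
    return (delta_ref - delta_samp) / denominator * 100.0


# Hypothetical values for illustration only (not data from this study).
print(percent_ndfa(delta_ref=6.0, delta_samp=2.35))  # close to 50
```

A sample whose δ15N equals that of the reference plant gives 0% NDFA, and a sample at −1.30‰ gives 100%.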

The difference among genotypes for N2 fixation under normal and stress conditions may help identify genomic regions controlling N2 fixation under water deficit conditions. Quantitative trait loci (QTLs) mapping is the molecular approach used to understand the genetic architecture of many physiological and agronomical traits. Recent advances in high throughput genotyping and phenotyping platforms have revolutionized the dissection of the genetic basis of quantitative traits like N2 fixation and will accelerate development of soybean lines with enhanced N2 fixation.

Several studies have mapped QTLs for N2 fixation or related traits in soybean (www.soybase.org). Tanya et al. (2005) used a population of 136 F2-derived recombinant inbred lines (RILs) to identify a total of nine QTLs for nodule number per plant, nodule fresh and dry weight per plant, and acetylene reduction activity (ARA). Nicolás et al. (2006) identified two genomic regions associated with nodule number and nodule dry weight. Santos et al. (2013) studied the genetic control of nodule number and individual nodule weight and confirmed a QTL for nodule number identified previously by Nicolás et al. (2006). Hwang et al. (2014) were the first to map QTLs for nodule number, nodule size, and nodule weight in field experiments.

Dhanapal et al. (2015b) used association mapping on a diverse panel of 374 maturity group 4 accessions to identify QTLs for NDFA and N concentration. This analysis identified 17 and 19 SNPs significantly associated with NDFA and N concentration, respectively. Steketee et al. (2019) used association mapping for δ15N using a panel of 211 diverse soybean accessions and found 23 and 26 SNPs associated with δ15N and N concentration, respectively. To date, more than 70 QTLs for N2 fixation or traits directly or indirectly linked with N2 fixation have been mapped on all 20 chromosomes of soybean (www.soybase.org).

In the present study, a high-density genetic linkage map was constructed using 196 F6-derived RILs developed from PI 416997 × PI 567201D. The parents of this population were originally chosen because they were extremes for the ratio between 13C and 12C (Bazzer et al. 2020a), which serves as a surrogate measure of water use efficiency (Farquhar and Richards 1984). Although the parents were not selected for δ15N or N2 fixation, their RILs segregated for δ15N. Therefore, the main objectives of our study were to identify additive QTLs for δ15N, epistatic QTLs, and QTL × environment interactions. Further characterization of genes underlying the QTLs identified in this study will help to understand the biological mechanisms regulating N2 fixation in soybean and the genetic basis of N2 fixation.

Materials and methods

Development of RIL population

The cross between PI 416997 and PI 567201D was made at Stoneville, MS in 2011. The F1 generation was grown during the winter of 2011–2012 at the Tropical Agricultural Research Station at Isabela, Puerto Rico. The F2 generation was grown in Stoneville in 2012, where over 200 individual F2 plants were harvested without selection. Leaf tissue was harvested from each tagged F2 plant for DNA extraction and genotyping of the population. The F2:3 and F4:5 generations were grown in Homestead, FL during the winters of 2012–2013 and 2013–2014, respectively, harvesting one random plant from each single-plant-derived row in each nursery. The F3:4, F5:6, and F6:7 generations were grown in Stoneville, with the former two generations being advanced by single-plant descent in 2013 and 2014, respectively, and the latter generation being bulk harvested in 2015 to create bulked F6-derived lines for phenotyping.

Field trials

A mapping population consisting of 196 F6-derived RILs generated from the cross PI 416997 × PI 567201D was used to identify the genomic regions associated with δ15N. The RIL population and parents were evaluated in four environments: at Stoneville, MS (33.42° N, 90.90° W) on a Bosket very fine sandy loam soil (fine-loamy, mixed, active, thermic, Mollic Hapludalfs) in 2016 and on a Dundee silty clay loam soil (Dundee fine-silty, mixed, active, thermic, Typic Endoaqualfs) in 2017, at the Milo J Shult Arkansas Agricultural Research Center, Fayetteville, AR (36.05° N, 94.15° W) on a Captina silt loam soil (fine-silty, siliceous, active, mesic Typic Fragiudult) in 2017, and at the Bradford Research Center near Columbia, MO (38.95° N, 92.33° W) on a Mexico silt loam soil (fine, smectitic, mesic Vertic Epiaqualf) in 2017. The combinations of locations and years were considered as individual environments and designated as ST16 (Stoneville in 2016), ST17 (Stoneville in 2017), FAY17 (Fayetteville in 2017), and CO17 (Columbia in 2017). Plantings occurred on 6 May 2016 at ST16 and 16 May 2017 at ST17 in one-row plots (0.66 m wide by 2.74 m long), 10 June 2017 at FAY17 in two-row plots (0.45 m wide by 6 m long), and 14 May 2017 at CO17 in single-row plots (0.76 m wide by 3.05 m long). At each environment, the experimental design was a randomized complete block design with two replications. Experiments were irrigated as needed. Recommended practices were followed for insect and weed control.

Data collection

Shoot biomass of four random plants was sampled from each plot between the beginning bloom (R1) and full bloom (R2) stages (Fehr and Caviness 1977) on 29 June 2016 at ST16, 21 June 2017 at ST17, and 21 July 2017 at FAY17 and CO17. Biomass samples were dried at 60 °C and coarse ground with a Wiley Mill (Thomas Model 4 Wiley® Mill, Thomas Scientific, NJ USA). Subsamples were ground to pass a 1 mm sieve and then ground to a fine powder with a Geno Grinder (SPEX CertiPrep, Inc., NJ USA) as described by Bazzer et al. (2020a). About 3–5 mg of the powdered sample was weighed into tin capsules for δ15N isotope analysis, which was conducted at the University of California-Davis Stable Isotope Facility (https://stableisotopefacility.ucdavis.edu/) using an elemental analyzer interfaced to a continuous flow isotope ratio mass spectrometer. Data from the stable isotope facility were expressed as δ15N in per mil (‰) and determined according to the equation below:

$${\delta }^{15}N=\left(\frac{{R}_{sample}}{{R}_{air}}-1\right)*1000$$

where Rsample and Rair are the isotope ratios (15N/14N) of the sample and air, respectively.

For more information refer to the Stable Isotope facility website, http://stableisotopefacility.ucdavis.edu/13cand15n.html.
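The delta notation used for the isotope data can be sketched in Python as follows, using the conventional definition δ15N = (Rsample/Rair − 1) × 1000. The function name is hypothetical, and the 15N/14N ratio of atmospheric N2 used as a default (0.0036765) is a commonly cited value that is assumed here; it is not a value reported in this study.

```python
# Assumed, commonly cited 15N/14N ratio of atmospheric N2 (not from this study).
R_AIR_15N = 0.0036765


def delta_15n(r_sample, r_air=R_AIR_15N):
    """Return delta-15N in per mil from 15N/14N isotope ratios."""
    if r_air <= 0:
        raise ValueError("r_air must be positive")
    return (r_sample / r_air - 1.0) * 1000.0


# A sample with the same ratio as air has a delta-15N of zero;
# a sample enriched by 0.5% relative to air is about +5 per mil.
print(delta_15n(R_AIR_15N))
print(delta_15n(R_AIR_15N * 1.005))
```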

Statistical analysis

Descriptive statistics of δ15N for each environment and correlation coefficients between different environments for δ15N were calculated with SAS version 9.4 (SAS Institute 2013). The difference between parents for δ15N in different environments was determined using a t-test. Analysis of variance (ANOVA) was conducted using SAS version 9.4 (SAS Institute 2013) with the PROC MIXED procedure (α = 0.05). Genotype and environment were considered as fixed effects and replication nested within environment was considered as a random effect (Bondari 2003). Heritability (h2, Holland et al. 2003) of δ15N for each environment and averaged across environments was computed using the PROC VARCOMP procedure of SAS 9.4 based on the following formulas.

Across environments:

$${h}^{2}=\frac{{\sigma }_{G}^{2}}{{\sigma }_{G}^{2} + \left(\frac{{\sigma }_{GE}^{2}}{e}\right)+ \left(\frac{{\sigma }_{e}^{2}}{re}\right)}$$

Within environments:

$${h}^{2}=\frac{{\sigma }_{G}^{2}}{{\sigma }_{G}^{2} + \left(\frac{{\sigma }_{e}^{2}}{r}\right)}$$

where \({\sigma }_{G}^{2}\), \({\sigma }_{GE}^{2}\), and \({\sigma }_{e}^{2}\) are the genotypic variance, genotype × environment interaction variance, and residual error variance, respectively, e is the number of environments, and r is the number of replications within environment. This heritability should be considered a narrow sense estimate, as F6-derived RILs have a minimal level of heterozygosity within lines. Hence, most of the genotypic variance is composed of additive variance, with negligible variance due to dominance effects and its interaction with additive effects.

The best linear unbiased prediction (BLUP) values for each individual environment and across environments were calculated using a mixed model to reduce environmental variance. All factors were considered as random effects in the case of individual environments. For calculation of BLUP values averaged across environments, environment was considered a fixed effect and genotypes and replications were considered as random effects (Littell et al. 1996; Piepho et al. 2008). QTL analysis was conducted using BLUP values for individual environments and across environments. The δ15N BLUP values of the 196 RILs for all individual environments and across environments are provided in Supplementary File 1.
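The two heritability formulas can be expressed as a short Python sketch that takes variance components (for example, as estimated by a variance component procedure) and returns h2. The function names and the variance values in the example are hypothetical; only the formulas themselves come from the text.

```python
def heritability_across(var_g, var_ge, var_e, n_env, n_rep):
    """Heritability on an entry-mean basis across environments.

    var_g  : genotypic variance
    var_ge : genotype x environment interaction variance
    var_e  : residual error variance
    n_env  : number of environments (e)
    n_rep  : number of replications within environment (r)
    """
    return var_g / (var_g + var_ge / n_env + var_e / (n_rep * n_env))


def heritability_within(var_g, var_e, n_rep):
    """Heritability on an entry-mean basis within a single environment."""
    return var_g / (var_g + var_e / n_rep)


# Hypothetical variance components, with four environments and two replications
# as in the field design described in the text.
print(heritability_across(var_g=1.0, var_ge=4.0, var_e=8.0, n_env=4, n_rep=2))
print(heritability_within(var_g=1.0, var_e=2.0, n_rep=2))
```

Because the interaction and error variances are divided by the number of environments and replications, the across-environment estimate can exceed the within-environment estimates for the same trait.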

Selection of lines with extreme values for δ13C and δ15N

As described previously (Bazzer et al. 2020a), this population was also evaluated for carbon isotope ratio (δ13C) (a proxy measurement for water use efficiency, WUE) as the parents were different in their δ13C values based on phenotypic values and genomic estimated breeding values (GEBVs) (Dhanapal et al. 2015a; Kaler et al. 2017). The linear regression between δ13C and δ15N was performed using phenotypic values from each individual environment and averaged across environments. The biplots were divided into four quadrants using median values of δ13C and δ15N to select the lines that were extremes for both traits.
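The median-based quadrant selection described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the line names, and the δ13C and δ15N values are hypothetical, and lines falling exactly on a median are treated as intermediate, a choice the text does not specify.

```python
from statistics import median


def classify_quadrants(lines):
    """Classify lines by the medians of delta-13C and delta-15N.

    lines: dict mapping line name -> (delta_13c, delta_15n)
    Returns a dict mapping line name -> 'favorable' (high delta-13C, low
    delta-15N), 'unfavorable' (low delta-13C, high delta-15N), or 'intermediate'.
    """
    med_c = median(values[0] for values in lines.values())
    med_n = median(values[1] for values in lines.values())
    classes = {}
    for name, (d13c, d15n) in lines.items():
        if d13c > med_c and d15n < med_n:
            classes[name] = "favorable"
        elif d13c < med_c and d15n > med_n:
            classes[name] = "unfavorable"
        else:
            classes[name] = "intermediate"
    return classes


# Hypothetical phenotypic values (per mil), for illustration only.
example = {
    "line_A": (-27.8, 1.2),
    "line_B": (-29.1, 3.9),
    "line_C": (-28.0, 3.5),
    "line_D": (-28.9, 1.5),
    "line_E": (-28.4, 2.5),
}
print(classify_quadrants(example))
```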

Genotyping-by-sequencing and construction of linkage map

Detailed information on genotyping and linkage map construction for this population was provided by Bazzer et al. (2020a) and is summarized below. The 196 RILs, together with their two parents, were sequenced and genotyped using genotyping-by-sequencing (GBS). DNA was isolated from lyophilized leaf tissue of the 196 RILs and both parents, and GBS library construction was conducted at LGC Genomics GmbH (Berlin, Germany). The sequence reads were mapped to the ‘Williams 82’ soybean reference genome (assembly 1). Markers with more than 15% missing data, markers that were heterozygous, and markers that did not follow a 1:1 segregation ratio (chi-square P-value ≤ 0.01) were removed, resulting in a total of 3234 polymorphic markers. Missing marker data were imputed using the LD-kNNi method implemented in TASSEL software (https://www.maizegenetics.net/tassel).
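Two of the marker filters described above (more than 15% missing data, and departure from a 1:1 segregation ratio at chi-square P ≤ 0.01) can be sketched in Python as follows. The function names and the genotype coding ('A', 'B', or None for missing) are hypothetical. The heterozygosity filter is left out because the text does not state exactly how it was applied, and this sketch is not the pipeline used by the authors.

```python
import math


def segregation_p_value(n_a, n_b):
    """P-value of a chi-square test (1 df) for a 1:1 ratio of two homozygous classes."""
    total = n_a + n_b
    if total == 0:
        raise ValueError("no genotype calls")
    chi_square = (n_a - n_b) ** 2 / total
    # For 1 degree of freedom, the upper tail probability is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi_square / 2.0))


def keep_marker(calls, max_missing=0.15, alpha=0.01):
    """Return True if a marker passes the missing-data and segregation filters.

    calls: list of 'A', 'B', or None (missing) across the lines.
    """
    n_missing = sum(call is None for call in calls)
    if n_missing / len(calls) > max_missing:
        return False
    n_a = calls.count("A")
    n_b = calls.count("B")
    if n_a + n_b == 0:
        return False
    return segregation_p_value(n_a, n_b) > alpha


# Hypothetical markers scored on 196 lines.
print(keep_marker(["A"] * 98 + ["B"] * 98))                # balanced, kept
print(keep_marker(["A"] * 130 + ["B"] * 66))               # distorted, removed
print(keep_marker(["A"] * 80 + ["B"] * 80 + [None] * 36))  # too much missing data
```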

The filtered and imputed 3234 polymorphic SNPs were used initially for construction of the linkage map. The MAP functional module of IciMapping software v4.1 (Meng et al. 2015) was used for genetic linkage map construction using 2468 out of 3234 polymorphic markers after dropping redundant and low-quality markers. Genetic linkage groups, marker order, and distances between markers were determined as described previously (Bazzer et al. 2020a). Linkage groups were numbered as soybean chromosomes according to the genomic position of SNPs on the reference genome. Finally, a map with 20 linkage groups corresponding to the 20 soybean chromosomes was constructed using 2,466 SNPs. SNP marker information, along with marker positions (in base pairs and cM) on specific chromosomes, is provided in Supplementary File 3 of Bazzer et al. (2020a).

QTL analysis

The BLUP values calculated for each individual environment and averaged across environments were used for QTL analysis. QTL mapping was performed using the QTL IciMapping v4.1 software (http://www.isbreeding.net/) through the BIP and MET functional modules. The BIP module with inclusive composite interval mapping (ICIM) was used to detect additive QTLs within and across environments. QTL × QTL interactions were identified using the Inclusive Composite Interval Mapping of Epistatic QTL (ICIM-EPI) function. The LOD threshold to declare significant additive QTLs and interactions between QTLs was calculated using permutation tests with 1,000 iterations and a genome-wide significance level of 0.05 (Doerge and Churchill 1996; Li et al. 2007). The mapping parameters were a 1.0 cM walking speed with a P-value inclusion threshold of 0.01 for ICIM-ADD and a 5 cM walking speed with a P-value inclusion threshold of 0.0001 for ICIM-EPI. The Multi-Environment Trials (MET) module was used for detection of QTL × environment interactions using δ15N BLUP data from all four environments. Missing phenotypic values were imputed using the ‘mean replacement’ method. The specific parameters for detecting QTL × environment interactions were a 1.0 cM walking speed and a probability of 0.01 in stepwise regression. Finally, the position of SNPs on different chromosomes and the position of identified QTLs on the genetic map were drawn using Prism software (https://www.graphpad.com/).
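The idea behind a permutation-based, genome-wide LOD threshold can be illustrated with a deliberately simplified Python sketch. It uses single-marker regression rather than ICIM, so it is not the algorithm implemented in QTL IciMapping. All function names, the 0/1 genotype coding, and the simulated data are hypothetical.

```python
import math
import random


def lod_single_marker(genotypes, phenotypes):
    """LOD score comparing a two-genotype-class model with a single-mean model."""
    n = len(phenotypes)
    mean_all = sum(phenotypes) / n
    rss_null = sum((y - mean_all) ** 2 for y in phenotypes)
    if rss_null == 0:
        return 0.0
    rss_marker = 0.0
    for allele in (0, 1):
        group = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        if group:
            group_mean = sum(group) / len(group)
            rss_marker += sum((y - group_mean) ** 2 for y in group)
    if rss_marker == 0:
        return math.inf
    return (n / 2.0) * math.log10(rss_null / rss_marker)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide LOD threshold from the distribution of permuted maximum LODs."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype association
        max_lods.append(max(lod_single_marker(m, shuffled) for m in markers))
    max_lods.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return max_lods[index]


# Simulated data with no true QTL: 25 markers scored on 60 lines.
rng = random.Random(1)
markers = [[rng.randint(0, 1) for _ in range(60)] for _ in range(25)]
phenotypes = [rng.gauss(0.0, 1.0) for _ in range(60)]
threshold = permutation_threshold(markers, phenotypes, n_perm=200)
print(round(threshold, 2))
```

Taking the maximum LOD over all markers in each permutation is what makes the threshold genome-wide: a QTL is declared only if its LOD exceeds what the best marker achieves by chance in 95% of the shuffled data sets.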

Identification of putative candidate genes

The search for putative candidate genes related to nodulation and N-metabolism underlying the genomic regions associated with δ15N identified in the present study was performed using the genome browser option (Williams 82 assembly 1) of Soybase (www.soybase.org). Genes between flanking markers and up to ± 1 Mb outside of the confidence interval for flanking markers were considered as potential candidate genes. Additionally, the positions of 52 soybean genes (28 nodulin + 24 regulatory genes) associated with nodulation or biological N2 fixation (Schmutz et al. 2010) were compared to the genomic regions of the δ15N QTLs identified in the present research.
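The window-based candidate gene search can be sketched as a simple interval overlap test. This is a minimal illustration, assuming that the window is defined by the physical positions of the flanking markers extended by 1 Mb on each side. The function name, gene names, and coordinates are hypothetical and do not correspond to real soybean gene models.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # bp
    end: int    # bp


def genes_near_qtl(genes, chrom, left_bp, right_bp, flank_bp=1_000_000):
    """Names of genes overlapping the flanking-marker interval extended by flank_bp."""
    window_start = max(0, left_bp - flank_bp)
    window_end = right_bp + flank_bp
    return [
        gene.name
        for gene in genes
        if gene.chrom == chrom and gene.start <= window_end and gene.end >= window_start
    ]


# Hypothetical gene models for illustration only.
genes = [
    Gene("gene_A", "Gm01", 2_500_000, 2_504_000),  # within 1 Mb of the left marker
    Gene("gene_B", "Gm01", 5_200_000, 5_203_000),  # more than 1 Mb past the right marker
    Gene("gene_C", "Gm04", 3_000_000, 3_002_000),  # different chromosome
]
print(genes_near_qtl(genes, "Gm01", 3_000_000, 3_400_000))  # ['gene_A']
```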

Results

Phenotypic evaluation

The phenotypic values of parents and descriptive statistics of the RIL population are presented in Table 1. The parent PI 416997 had lower δ15N values than PI 567201D in all environments, but parents were not significantly (P < 0.05) different in any single environment. However, ANOVA across environments did indicate that δ15N for PI 416997 was significantly (P < 0.01) lower than PI 567201D (data not shown). There was wide segregation in the RIL population for δ15N (Fig. 1; Table 1) as indicated by δ15N ranges of 3.96‰ in ST16, 4.08‰ in ST17, 3.43‰ in FAY17, and 4.55‰ in CO17. The frequency distribution of δ15N was normal in all environments except CO17, as indicated by the Shapiro-Wilk test (Shapiro and Wilk 1965; data not shown) and by absolute values of skewness and kurtosis less than 1.0 (Table 1). Transgressive segregants exceeding both parents were observed, which indicates that favorable alleles for δ15N were distributed between both parents. A significant positive correlation (P < 0.05) was found between ST16 and ST17 (r = 0.15), ST17 and FAY17 (r = 0.15), and ST17 and CO17 (r = 0.14) (data not shown).

Fig. 1 Distribution of δ15N among recombinant inbred lines and parental genotypes at Stoneville, MS in 2016 (a), Stoneville, MS in 2017 (b), Fayetteville, AR in 2017 (c), and Columbia, MO in 2017 (d)

Table 1 Phenotypic variation for δ15N (‰) in the parents (PI 416997 and PI 567201D) and RIL population grown in four environments (Env.): Stoneville in 2016 (ST16), Stoneville in 2017 (ST17), Fayetteville in 2017 (FAY17), and Columbia in 2017 (CO17)

Analysis of variance (ANOVA) averaged across environments showed significant (P < 0.05) effects of genotype (G) and environment (E), whereas the interaction of genotype and environment (G × E) was not significant (Table 2). The narrow sense heritability of δ15N averaged across environments was 35%. Estimates of narrow sense heritability for δ15N within environments were 8% (ST16), 13% (ST17), 27% (FAY17), and 24% (CO17). Overall, low narrow sense heritability estimates across and within environments indicate that environmental effects play a major role in the expression of this trait and that phenotypic selection for this trait may only be successful using replicated trials of homogeneous lines across multiple environments.

Table 2 Analysis of variance (ANOVA) for δ15N in the RIL population along with parents evaluated in four environments (ST16, ST17, FAY17, and CO17)

The RIL population used in the present study was also evaluated for δ13C because the parents, PI 416997 and PI 567201D, also differed in WUE (Bazzer et al. 2020a). Although no significant correlation was found between these traits, regression analysis was performed between δ15N and δ13C phenotypic values for within and across environments to identify extremes among RILs for different combinations of δ15N and δ13C. The selected five RILs for high relative N2 fixation and high WUE in individual environments and averaged across environments are listed in Table 3. The lines RIL.14, RIL.25, RIL.75, and RIL.135 were identified in at least two individual environments and across environments as having a favorable combination (low δ15N and high δ13C). Similarly, lines with the unfavorable combination of high δ15N and low δ13C values were identified in individual environments and averaged across environments. The lines RIL.24, RIL.127, and RIL.161 were identified as having the unfavorable combination of high δ15N and low δ13C values in at least two individual environments, and RIL.112 had this unfavorable combination across environments in addition to individual environments.

Table 3 RILs with high δ13C and low δ15N phenotypic values or RILs with low δ13C and high δ15N phenotypic values in individual environments (ST16: Stoneville in 2016, ST17: Stoneville in 2017, FAY17: Fayetteville in 2017, and CO17: Columbia in 2017) and across environments (AE). Values in parentheses are the phenotypic values of δ13C and δ15N for that RIL

QTL analysis

The linkage map was 3,836 cM with individual linkage groups varying between 116 and 409 cM, based on the construction using 2,466 SNP markers (data not shown, Bazzer et al. 2020a). QTL analysis conducted using δ15N BLUP values from individual environments identified a total of 10 additive QTLs within environments (Fig. 2; Table 4), which were distributed on seven chromosomes (Gm01, Gm04, Gm07, Gm08, Gm10, Gm13, and Gm17). Of these QTLs, there were four QTLs in ST16, two QTLs in ST17, and four QTLs in FAY17. No QTLs were identified in CO17. The phenotypic variation explained by individual QTLs (denoted as R2), their additive effect, and parent contributing favorable alleles are presented in Table 4. These QTLs individually accounted for 1.72 to 9.34% of the phenotypic variation and had additive effects ranging from 0.02 to 0.13‰. The QTL on Gm04 at 49,247,258 bp detected in ST17 had the highest R2 value (9.34%) among these QTLs.

Fig. 2 Physical position of SNPs on soybean chromosomes and position of loci (horizontal red bars) associated with δ15N identified by ICIM mapping for additive QTLs. The numbers in the black circles represent the loci number on a specific chromosome. Vertical colored bars (except blue) indicate the other QTLs found at the same positions in previous studies, and yellow circles indicate the nodulation genes

Table 4 Quantitative trait loci (QTLs) associated with δ15N detected in individual environments (ST16, ST17, FAY17, and CO17) and across environments (AE) in the RIL population of PI 416997 × PI 567201D using the BIP functional module of ICIM mapping

QTL analysis using the BLUP values averaged across environments (AE) with the ICIM-ADD mapping method identified eight QTLs (Fig. 2; Table 4). These QTLs were present on Gm01 (2), Gm04 (1), Gm07 (1), Gm13 (1), Gm14 (1), and Gm15 (2), had additive effects that ranged from 0.04 to 0.12‰, and individually explained 1.83 to 14.39% of the phenotypic variation (Table 4). The QTLs on Gm01, Gm04, Gm07, and Gm13 appeared in both individual environments and across environments. The QTLs on the remaining chromosomes were detected only in single environments or only across environments. An allele decreasing δ15N values was considered as the favorable allele, and the favorable allele for these QTLs was equally distributed between parents (PI 416997 and PI 567201D). When considering overlapping confidence intervals, there were 13 loci detected within and across environments (Fig. 2; Table 4).
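Grouping QTLs with overlapping confidence intervals into loci can be sketched as an interval merge. The function name and the example intervals are hypothetical, and the sketch assumes that two QTLs belong to the same locus whenever their confidence intervals on the same chromosome overlap or touch.

```python
def merge_into_loci(qtls):
    """Merge QTLs with overlapping confidence intervals into loci.

    qtls: iterable of (chromosome, ci_left, ci_right) tuples, in cM or bp.
    Returns a list of merged (chromosome, left, right) loci.
    """
    loci = []
    for chrom, left, right in sorted(qtls):
        if loci and loci[-1][0] == chrom and left <= loci[-1][2]:
            previous = loci[-1]
            loci[-1] = (chrom, previous[1], max(previous[2], right))
        else:
            loci.append((chrom, left, right))
    return loci


# Hypothetical confidence intervals for illustration only.
qtls = [
    ("Gm01", 10.0, 14.0),  # detected in one environment
    ("Gm01", 13.0, 18.0),  # detected across environments, overlaps the first
    ("Gm01", 60.0, 65.0),
    ("Gm04", 12.0, 15.0),
]
print(merge_into_loci(qtls))
# [('Gm01', 10.0, 18.0), ('Gm01', 60.0, 65.0), ('Gm04', 12.0, 15.0)]
```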

QTL × environment and QTL × QTL interactions analysis

QTL × environment and QTL × QTL interactions play important roles in the genetic control of quantitative traits (Rebetzke et al. 2007; Reif et al. 2011). Eight QTLs present on Gm01 (2), Gm04, Gm07, Gm08 (2), Gm10, and Gm13 showed significant QTL × environment interactions as identified by the MET functionality (Table 5). These QTLs explained 1.17 to 28.25% of the phenotypic variation and had additive effects ranging from 0.01 to 0.03‰ (Table 5). Phenotypic variation due to additive × environment effects (PVE (A × E)) was greater than that due to additive effects (PVE (A)), and the LOD score of additive effects (LOD (A)) was less than the LOD score for additive × environment effects (LOD (A × E)) for most of these QTLs, indicating that these QTLs had strong interactions with environments. The QTL on Gm01 (3,032,794 bp) had a greater LOD score and PVE for the additive effect than for the additive × environment effect (Table 5), indicating the stability of this QTL across environments.

Table 5 QTLs showing QTL × environment interaction in four environments detected using MET functional module of ICIM mapping

QTL × QTL interactions were detected using the Epistatic QTL (ICIM-EPI) method of the BIP functional module for δ15N values from individual environments. Epistatic interactions between different genomic regions were detected in ST16 and FAY17 and across environments (AE) (Table 6). The phenotypic variation explained by these interactions ranged from 3.53 to 7.78%, with the LOD score of these interactions being greater than or equal to 3.5. The QTLs involved in epistasis were not identified as additive QTLs. No epistasis was detected in ST17 or CO17.

Table 6 Epistatic QTLs identified for δ15N in the RIL population of PI 416997 × PI 567201D by the ICIM-EPI method of BIP functional module

Identification of putative candidate genes

Of the 13 additive loci, five (loci 1, 3, 6, 9, and 12) fell in genomic regions carrying published soybean nodulation genes reported by Schmutz et al. (2010) (Fig. 2) that are directly involved in nodulation through production of nodulin proteins, nodulation signaling proteins, and different regulatory proteins. For example, Glyma.01g03470 (Locus 1), Glyma.04g43090 (Locus 3), Glyma.08g05370 (Locus 6), Glyma.14g05690 (Locus 9), and Glyma.17g08110 (Locus 12) genes (Schmutz et al. 2010) are involved directly in the process of nodulation. The nodulation genes Glyma.11g06740, Glyma.13g40400, and Glyma.15g05010 were in the genomic regions of epistatic QTLs present on Gm11, Gm13, and Gm15, respectively. The genes with functions related to N metabolism or N2 fixation present between flanking markers and up to ± 1 Mb outside of the confidence interval of flanking markers underlying δ15N loci are provided in Supplementary File 2.

Discussion

A prerequisite for genetic improvement of N2 fixation is adequate genetic variability for the trait, and understanding the genetic basis of this variability using a dense genetic map would be helpful for implementing the most appropriate strategies in a soybean breeding program. In this study, we investigated the variability in δ15N as a proxy for biological N2 fixation using a population of RILs. In previous studies, NDFA was used for estimation of N2 fixation in soybean and other legumes (Dhanapal et al. 2015b; Heilig et al. 2017; Ramaekers et al. 2013). In the present study, δ15N values were directly used as an estimate of N2 fixation, as a non-nodulating/reference genotype was not planted with the experimental material in order to calculate NDFA values. Steketee et al. (2019) also used δ15N values in mapping N2 fixation in a GWAS panel for soybean. Our results found a significant difference (P < 0.01) between the parents when combined over environments, with PI 416997 having lower δ15N values than PI 567201D. The low δ15N value of PI 416997 indicates that the proportion of N from N2 fixation was greater for PI 416997 compared to PI 567201D.

The RILs had a wide phenotypic range for δ15N in all environments (ranges of 3.43 to 4.55‰) (Table 1), but the specific range of δ15N in soybean is not well defined (Dhanapal et al. 2015b; Ludidi et al. 2007). The presence of transgressive segregants indicates that selection of lines for both low and high δ15N (along with low and high δ13C) values would be possible. Biplot analysis of δ15N and δ13C identified five RILs with favorable (low δ15N and high δ13C) and unfavorable (high δ15N and low δ13C) phenotypic combinations for δ13C and δ15N (Table 3). RIL.25 and RIL.135 were among the five top RILs (high WUE and high N2 fixation) in three of the four environments as well as across environments. Comparison of lines with contrasting δ15N and δ13C signatures may be important in characterizing the physiological interactions between N2 fixation and WUE.

Narrow sense heritability of δ15N across environments was 35% and for individual environments ranged from 8 to 27% in this study. Steketee et al. (2019) reported low broad sense heritability of δ15N (H = 17%) in an association study conducted using a diverse panel of soybean accessions. Similarly, Dhanapal et al. (2015b) found that broad sense heritability of NDFA was low (H = 21%) in a GWAS panel. In previous research, several physiological traits linked with N2 fixation were used for studying the genetic basis of N2 fixation (Hwang et al. 2013; Ray et al. 2015; Santos et al. 2013; Vieira et al. 2006). Narrow sense heritability was 18% (Vieira et al. 2006) and 33% (Santos et al. 2013) for nodule number, and 27% for weight per nodule (Santos et al. 2013). Broad sense heritability ranged from 33% (Ray et al. 2015) to 73% (Hwang et al. 2013) for shoot ureides and was 59% for shoot N concentration (Hwang et al. 2013). Therefore, heritability of traits related to N2 fixation generally appears to be low to moderate, which is consistent with a strong influence of environmental conditions on N2 fixation (Mastrodomenico and Purcell 2012; Ramaekers et al. 2013; Serraj et al. 1999a; Sinclair et al. 1987).

BLUP values of δ15N were used in the QTL analysis because they increase the accuracy of QTL detection by reducing the impact of environment. QTL analysis for δ15N by individual environment identified 10 QTLs present on seven chromosomes (Fig. 2; Table 4). No common QTLs were detected in two or more environments, but five QTLs detected in specific environments were also found across environments (Table 4). In addition to the additive QTLs identified by individual environment analysis, one QTL on Gm14 and two QTLs on Gm15 were detected in QTL analysis across environments (Table 4). When considering QTLs detected in individual environments and across environments and overlapping confidence intervals, a total of 13 loci (Fig. 2; Table 4) were identified.

Of the eight QTLs that had significant QTL × environment interaction, six were also detected in individual environment analysis (Table 5). Although we did not find any QTLs common among individual environments, the detection of additive QTLs in joint-environment analysis by MET functionality increases the confidence of detection of these QTLs. The low phenotypic variation explained by the additive effect (PVE (A)) compared with the additive × environment effect (PVE (A × E)) indicates a large effect of environment for all QTLs except for the QTL on Gm01 (3,032,794 bp) (Table 5). The phenotypic variation explained by these additive QTLs is small (R2 < 10%), which indicates the complex nature of biological N2 fixation (Santos et al. 2013).

In this study, eight epistatic interactions between QTLs on different chromosomes explained 41% of the phenotypic variation (Table 6). Further, these epistatic QTLs were not identified as additive QTLs (Table 4). An epistatic QTL present on Gm05 was detected in both ST16 and FAY17, but this QTL interacted with different epistatic QTLs in ST16 (Gm06) and FAY17 (Gm02). Also, an epistatic QTL present on Gm03 that was detected in both FAY17 and AE interacted with the QTLs present on Gm15 (FAY17) and Gm14 (AE) (Table 6). Our results indicate that additive QTLs, QTL × environment interactions, and epistasis were important factors influencing the variation in δ15N in soybean.

The genomic regions of the identified δ15N loci were screened in Soybase (www.soybase.org) for the presence of QTLs associated with N2 fixation and other N-related physiological traits. Loci 3, 7, and 12 co-localized with previously identified QTLs for nodule-related traits such as nodule size (Hwang et al. 2013), nodule number (Shi et al. 2018), and nodule weight (Hwang et al. 2013; Shi et al. 2018). Locus 4 and Locus 10 coincided with ureide QTLs (Ray et al. 2015), and Locus 1 with a shoot N QTL (Dhanapal et al. 2015b) identified in a GWAS panel. Also, Loci 1, 7, and 12 co-localized with QTLs for δ15N identified in an association study (Steketee et al. 2019). Among these loci, Loci 1, 3, and 12 coincided with candidate genes involved in the nodulation process. Gene N36 (Glyma.01g03470), underlying Locus 1, is an early nodulin gene involved in the initiation of nodule development (Kouchi and Hata 1993). This gene also plays an important role in the translocation of photosynthate into nodule tissue. Glyma.04g43090, underlying Locus 3, encodes a nodulation signaling protein (NSP2), which is a Nod-factor-activated transcription factor required for nodulation initiation (Murakami et al. 2006). Similarly, the N315 gene (Glyma.17g08110, underlying Locus 12) is expressed at the time of nodule emergence and plays a unique role in nodule formation (Kouchi and Hata 1993).

The δ15N QTLs at Loci 4 and 12 also overlapped with the genomic regions associated with δ13C identified in the same population (Bazzer et al. 2020a). Additionally, Loci 1, 3, 8, and 12 coincided with δ13C QTLs identified in GWAS mapping (Dhanapal et al. 2015a; Kaler et al. 2017). Thus, among the 13 additive loci associated with δ15N in the present research, five loci overlap with QTLs previously reported for δ13C (Fig. 2). It is tempting to speculate that greater δ13C may indirectly lead to increased N2 fixation under drought due to a greater supply of carbohydrates to nodules. Both water and carbohydrates are supplied to nodules through the phloem (Walsh et al. 1998), and allocation of greater amounts of carbon to nodules during drought is associated with prolonged N2 fixation (Purcell et al. 1998; King and Purcell 2001). Although there was no significant correlation between δ15N and δ13C (Bazzer et al. 2020a) among RILs in the present research, Loci 4 and 12 were associated with both δ15N and δ13C; however, the favorable alleles for δ15N were derived from PI 576201D and those for δ13C from PI 416997. Clearly, further research is needed to resolve the relationship between δ15N and δ13C in improving drought tolerance.

Similarly, epistatic QTLs (except QTLs present on Gm02) coincided with QTLs reported for nodule-related traits (Hwang et al. 2013; Nicolás et al. 2006; Santos et al. 2013; Shi et al. 2018), ureide concentration (Ray et al. 2015), NDFA (Dhanapal et al. 2015b), and δ15N (Steketee et al. 2019). Epistatic QTLs on Gm11, Gm13, and Gm15 were in the genomic regions carrying nodulation genes Glyma.11g06740, Glyma.13g40400, and Glyma.15g05010, respectively (Schmutz et al. 2010). Epistatic QTLs present on Gm06, Gm11, Gm15, and Gm17 coincided with additive QTLs for δ13C (Bazzer et al. 2020a). In addition, epistatic QTLs also co-localized with δ13C QTLs found in various studies (Bazzer et al. 2020b; Dhanapal et al. 2015a; Kaler et al. 2017).

The co-localization of the identified additive and epistatic δ15N QTLs with QTLs for N2 fixation and WUE-related traits supports the evidence that genetic links exist between these traits in soybean. To our knowledge, this is the first study on QTL analysis for δ15N using a biparental population in soybean. Although many of the QTLs associated with δ15N were environment-specific, most of these QTLs co-localized with QTLs associated with N2 fixation and other N-related physiological traits (e.g., nodule size, nodule number, nodule weight, NDFA, ureide, and shoot N concentration), which increases the reliability of these δ15N QTLs/loci for breeding purposes. Similarly, five of the 13 loci identified in this study as being associated with δ15N are known locations for soybean nodulation genes. These QTLs/loci provide tools for identifying genotypes with favorable alleles for N2 fixation that can be used in breeding programs to develop germplasm and to understand the expression of N2 fixation genes.

Conclusions

In the present study, a mapping population of 196 F6-derived RILs was evaluated in multiple environments to understand the genetic basis of δ15N. A wide range of δ15N values was observed in all environments, and narrow-sense heritability of δ15N was low, indicating significant effects of environment on δ15N. Transgressive segregants for δ15N were observed among the RILs, indicating that it is possible to create, from the specific parents used, extreme genotypes with lower and higher δ15N values than observed in either parent. Both parents contributed to the higher and lower values observed. The extreme genotypes created in this population may be useful in future studies to better assess the physiological mechanisms of N2 fixation. QTL analysis by environment identified 10 additive QTLs present on seven chromosomes that individually explained less than 10% of the observed phenotypic variation. Considering QTLs identified across environments, along with those from individual environments, there were 13 loci for δ15N based on their overlapping confidence intervals. QTL detection was inconsistent among environments, as QTLs identified in an individual environment did not overlap with QTLs in any of the other environments. Co-localization of δ15N QTLs with QTLs for important agronomic and physiological traits related to N2 fixation, and the presence of reported nodulation genes associated with these QTLs, increases the likelihood that the newly identified regions are associated with N2 fixation. Further studies are needed to fine map these QTLs, to understand their expression, and to determine how they interact with putative candidate genes.