Introduction

Phytophthora root and stem rot (PRSR), caused by the oomycete pathogen Phytophthora sojae, is a prevalent soybean disease in most soybean-growing regions throughout the world. It was estimated that PRSR caused an annual yield loss of approximately 44.7 million bu from 1996 to 2009 in the United States (Wrather and Koenning 2006, 2009). In certain years with heavy rainfall and poor drainage, yield losses could reach 100 % in the affected fields. In the past decades, soybean yield losses to P. sojae have been limited by incorporating Rps genes conferring resistance to prevalent races of the pathogen into elite cultivars. Today, deploying resistant cultivars remains the most effective, economical, and environmentally friendly approach to managing this disease.

So far, a total of 20 Rps loci including 26 alleles have been identified, which are distributed on seven chromosomes (Hanson et al. 1988; Anderson and Buzzell 1992; Polzin et al. 1994; Demirbas et al. 2001; Gardner et al. 2001; Weng et al. 2001; Gao et al. 2005; Sandhu et al. 2004, 2005; Gordon et al. 2006; Sugimoto et al. 2008; Fan et al. 2009; Yao et al. 2010; Wu et al. 2011; Sun et al. 2011; Lin et al. 2013; Zhang et al. 2013; Ping et al. 2016). Rps7, Rps9, RpsYu25, RpsYD29, RpsUN1, and the Rps1 locus, which includes five alleles (Rps1-a, Rps1-b, Rps1-c, Rps1-d, and Rps1-k), are located on the short arm of chromosome 3. Rps2 and RpsUN2 are located on the long arm of chromosome 16. The Rps3 locus, which includes three “so-called” alleles (Rps3-a, Rps3-b, and Rps3-c) and is either linked to or allelic with Rps8, is located on chromosome 13. Rps4, Rps5, and Rps6 are linked and located on chromosome 18. Rps11 is located on the short arm of chromosome 7. Because many of these Rps loci were genetically anchored to linkage groups (LGs) or chromosomes using different types/sets of molecular markers and mapping populations of relatively small sizes, the order of most Rps loci on the same chromosome and their relative distances remain unclear. In addition, the multiple alleles at the Rps1 or Rps3 locus were roughly defined by such markers and populations without further fine-scale mapping using the same sets of molecular markers. As such, whether they are truly allelic with each other or actually belong to different loci is also unclear. Fine mapping of these genes/alleles would enable their more accurate definition and facilitate marker-assisted selection for breeding new cultivars resistant to P. sojae.

Among the race-specific Rps genes/alleles that have been identified, Rps1-k, Rps1-c, and Rps3-a have been the primary ones deployed for soybean protection in the past few decades. In particular, Rps1-k has been isolated by a map-based cloning approach and widely used for developing resistant cultivars (Kasuga et al. 1997; Gao et al. 2005). Intriguingly, the two Rps1-k candidate genes, both of which encode nucleotide-binding site-leucine-rich repeat (NBS-LRR) proteins and were identified by sequencing three overlapping bacterial artificial chromosome (BAC) clones from Williams 82, could not be found in the Williams 82 soybean reference genome sequence. As a result, functional markers that would most effectively target the causative mutations/genetic variations underlying resistance to P. sojae have not been developed for any of the previously identified Rps genes.

In general, the resistance contributed by individual Rps genes is non-durable. Each such gene, if used alone, would be effective for only 8-15 years due to rapid variation of the pathogen under selection pressure (Schmitthenner 1985). It is thus not surprising that Rps1-k, the most widely deployed Rps gene for breeding commercial soybean cultivars, has lost its effectiveness against many emerging isolates of the pathogen (Sugimoto et al. 2012). A promising strategy for breeding more durable resistance is to pyramid multiple, broad-spectrum Rps genes into a single cultivar. However, several Rps genes/alleles are distributed in the same or adjacent genomic regions, and may not be simultaneously tagged unless they are fine mapped or cloned. When multiple Rps gene donors are involved in breeding, defining haplotypes of the genomic regions surrounding the target Rps loci would also be essential for designing effective markers to distinguish individual target regions harboring different Rps genes.

Recently, we identified two Rps genes, RpsUN1 and RpsUN2, from a soybean landrace, which together confer complete resistance to all 16 P. sojae races/isolates used in resistance evaluation, including the prevalent ones identified in the state of Indiana (Lin et al. 2013). RpsUN1 was defined to a 6.6 cM region between SSR markers Satt159 and BARCSOYSSR_03_0250 that spans the Rps1 locus on chromosome 3, corresponding to 1387 kb of genomic region in the soybean reference genome. RpsUN2 was defined to a 3.0 cM region between BARCSOYSSR_16_1275 and Sat_144 that is closely linked to the Rps2 locus on chromosome 16, corresponding to 423 kb of genomic region in the reference genome. These genomic regions represent two major NBS-LRR (R) gene clusters, which account for approximately a quarter of all R gene models predicted in the reference genome (Schmutz et al. 2010). It is documented that R gene clusters increase the probability of structural and copy number variation of R genes through equal or unequal chromosomal recombination events, resulting in acquisition or loss of resistance, which suggests that these regions may be hotspots for novel Rps genes. Indeed, marker-assisted resistance spectrum analysis suggested that both RpsUN1 and RpsUN2 are potentially novel Rps genes/alleles, representing a new source of resistance for enhancing the durability and level of resistance to P. sojae. However, because different Rps genes in the two R gene clusters are closely linked, selection and pyramiding of individual genes from different sources into a single cultivar would be ineffective with markers distant from the target genes. To implement effective and precise selection for this new source of Rps genes for deployment in soybean breeding programs, we conducted fine mapping of RpsUN1 and RpsUN2, identification of candidates for these two genes, and haplotype analysis of the genomic regions surrounding the two genes.

Materials and methods

Plant materials

The mapping populations were generated by reciprocal crosses between Williams and PI 567139B. They include F2:3 families from 44 surviving F2 seedlings derived from the “Williams × PI 567139B” cross (dubbed the “WPT” families), which were transferred to larger pots after inoculation to produce seeds; an additional 403 F2:3 families (dubbed the “WP” families) derived from the “Williams × PI 567139B” cross as described previously (Lin et al. 2013); and 379 F2:3 families derived from the “PI 567139B × Williams” cross (dubbed the “PW” families). Together, these make a total of 826 F2:3 families for fine mapping of the two loci.

Inoculation treatment and disease evaluation

Phytophthora sojae isolates pmg(17)-1 and pmg(25)-1, with pathotypes corresponding to races 17 and 25, respectively, were used to evaluate the resistance of the F2:3 families using a protocol previously described (Dorrance et al. 2008; Lin et al. 2013).

Sample collection and DNA extraction of mapping population

Approximately 15-20 F2:3 seedling leaf samples from the progeny of each F2 recombinant, as determined by molecular markers surrounding RpsUN1 and RpsUN2, respectively, were mixed in equal amounts for DNA isolation. Genotyping of the detected F2 recombinants with SSR (Song et al. 2010), CAPS, and SNP markers was conducted following methods described previously (Lin et al. 2013; Ping et al. 2016) using primers and enzymes listed in Online Resource 1.

Evaluation of gene expression by RNA-seq

Approximately 20 seedlings of PI 567139B were inoculated with P. sojae race 1 (pmg(1)-3), and 20 seedlings from the same line were wounded without inoculation. After 24 h, stem sections of 2–3 cm in length spanning the wounded sites were collected from the inoculated and the wounded seedlings to form the inoculated and wounded groups, respectively. RNA isolation, RNA-seq, and data analysis were performed following a protocol described earlier. RNA-seq data from Williams and 10 Rps gene isogenic lines (Rps1-a, Rps1-b, Rps1-c, Rps1-k, Rps3-a, Rps3-b, Rps3-c, Rps4, Rps5, and Rps6) under the same treatment as performed for PI 567139B (Lin et al. 2014) were analyzed using the same protocol as used for PI 567139B. The expression values (FPKM, fragments per kb of exon per million fragments mapped) of the genes in the mapped RpsUN1 and RpsUN2 regions were measured and compared using Cufflinks (Trapnell et al. 2012). The RNA-seq data obtained from PI 567139B in this study have been deposited at the National Center for Biotechnology Information Gene Expression Omnibus under accession number GSE82240.
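For readers unfamiliar with the FPKM unit, the following is a minimal sketch of its textbook definition. It is not the estimation procedure implemented in Cufflinks, which models transcript abundance statistically; the function name and the example numbers are hypothetical and serve only to illustrate the arithmetic.

```python
def fpkm(fragments_on_gene, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped.

    Textbook definition only (an illustration, not Cufflinks' estimator):
    FPKM = fragments_on_gene / (exon_length_kb * total_mapped_millions)
    """
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and library size must be positive")
    # 1e9 = 1e3 (bp per kb) * 1e6 (fragments per million)
    return fragments_on_gene * 1e9 / (exon_length_bp * total_mapped_fragments)


# Hypothetical numbers: 500 fragments on a gene with 2,000 bp of exon
# in a library of 10 million mapped fragments.
example_fpkm = fpkm(500, 2000, 10_000_000)
```

Normalizing by both exon length and library size is what makes values comparable between genes of different lengths and between samples sequenced to different depths.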

Phylogenetic analysis

The phylogenetic neighbor-joining trees were constructed using SNPs extracted from the two defined genomic regions among all the USDA soybean germplasm accessions genotyped with the SoySNP50K iSelect BeadChip containing 52,041 SNPs (Song et al. 2013, 2015) following the methods previously described (Tian et al. 2010).
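Neighbor-joining operates on a matrix of pairwise distances between accessions. The sketch below shows one simple way such a matrix could be derived from SNP haplotypes, using the proportion of differing calls (p-distance). This is an assumption made for illustration; the distance measure actually used follows Tian et al. (2010) and is not specified here. The accession names, haplotype strings, and the missing-call symbol "N" are hypothetical.

```python
from itertools import combinations


def p_distance(hap_a, hap_b, missing="N"):
    """Proportion of SNP sites that differ, ignoring sites with a missing call."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same SNP sites")
    compared = 0
    differing = 0
    for a, b in zip(hap_a, hap_b):
        if a == missing or b == missing:
            continue  # skip sites lacking a call in either accession
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no SNP sites with calls in both accessions")
    return differing / compared


def distance_matrix(haplotypes):
    """Pairwise p-distances for a dict of {accession: haplotype string}."""
    return {
        (x, y): p_distance(haplotypes[x], haplotypes[y])
        for x, y in combinations(sorted(haplotypes), 2)
    }


# Hypothetical 10-SNP haplotypes for three accessions.
haps = {
    "acc1": "AAGGTTCCAA",
    "acc2": "AAGGTTCCAT",
    "acc3": "TTGGTTCCNA",
}
dist = distance_matrix(haps)
```

The resulting matrix could then be passed to any neighbor-joining implementation.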

Results

Fine mapping of RpsUN1 on chromosome 3

To further narrow the region harboring RpsUN1, SSR markers near or at the boundaries of the previously mapped RpsUN1 region, BARCSOYSSR_03_180 and BARCSOYSSR_03_250, were used to genotype the 826 F2 plants derived from the reciprocal crosses between Williams and PI 567139B, and nine recombinants between the two SSR markers were identified. Subsequently, the F2:3 families derived from these recombinants were inoculated with P. sojae race 17 (Online Resource 2), which is avirulent to RpsUN1 but virulent to RpsUN2. The corresponding F2 plants or pools of F3 seedlings from individual F2 plants were genotyped using seven additional SSR markers exhibiting polymorphisms between the two parental lines. As shown in Fig. 1, the recombinants WP133 and PW202 placed the RpsUN1 locus downstream of the marker BARCSOYSSR_03_0233, while the recombinants WP243 and PW318 placed it upstream of the marker BARCSOYSSR_03_0246. The genotypic and phenotypic data from the remaining five recombinants were as expected and consistent with the data from the four recombinants WP133, PW202, WP243, and PW318. Thus, the candidate gene for RpsUN1 was defined to the region between BARCSOYSSR_03_0233 and BARCSOYSSR_03_0246. According to the Williams 82 soybean reference genome, these two markers span a ~151 kb region, which contains five annotated genes, including three predicted R-like genes (Online Resource 3).
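The logic of delimiting a locus with recombinants can be sketched as follows. This is an illustrative simplification, assuming a single crossover per recombinant within the tested interval and error-free genotypes and phenotypes; the marker names, genotype codes, and recombinant data are hypothetical and are not taken from Fig. 1.

```python
def narrow_interval(markers, recombinants):
    """Return the (left, right) markers flanking the locus.

    markers: marker names ordered along the chromosome.
    recombinants: list of (marker_genotypes, locus_genotype), where
        marker_genotypes holds one code per marker ('A' = homozygous
        resistant parent, 'B' = homozygous susceptible parent,
        'H' = heterozygous) and locus_genotype is the genotype at the
        resistance locus deduced from phenotyping the F2:3 family.
    A marker whose genotype disagrees with the deduced locus genotype
    is separated from the locus by the crossover.
    """
    left, right = -1, len(markers)  # -1 / len(markers) mean "no bound yet"
    for genotypes, locus in recombinants:
        mismatch = [i for i, g in enumerate(genotypes) if g != locus]
        match = [i for i, g in enumerate(genotypes) if g == locus]
        if not mismatch or not match:
            continue  # uninformative under the single-crossover assumption
        if max(mismatch) < min(match):
            left = max(left, max(mismatch))    # locus lies downstream
        elif min(mismatch) > max(match):
            right = min(right, min(mismatch))  # locus lies upstream
        else:
            raise ValueError("pattern not explained by a single crossover")
    return (
        markers[left] if left >= 0 else None,
        markers[right] if right < len(markers) else None,
    )


# Hypothetical data: four ordered markers and three recombinants.
markers = ["M1", "M2", "M3", "M4"]
recombinants = [
    ("BBAA", "A"),  # crossover between M2 and M3; locus downstream of M2
    ("HHHB", "H"),  # crossover between M3 and M4; locus upstream of M4
    ("BAAA", "A"),  # crossover between M1 and M2; less restrictive
]
flanking = narrow_interval(markers, recombinants)
```

Each informative recombinant tightens at most one side of the interval, which is why recombinants with crossovers close to the locus on both sides are needed.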

Fig. 1

Fine mapping of the RpsUN1 locus. a Linkage map of the RpsUN1 locus. b Physical positions of molecular markers (according to the Williams 82 reference genome) used for fine mapping of the RpsUN1 locus. c Recombinants carrying crossovers as determined by molecular markers and genotypes of the F2 plants deduced by phenotyping the F2:3 families. Black, white, and gray bars represent homozygous segments from PI 567139B, Williams 82, and heterozygous segments from both parental lines as determined by molecular markers, respectively. d Annotated genes in the fine mapped region of the Williams 82 reference genome. Black boxes indicate disease resistance-related genes, whereas gray boxes indicate other types of genes. Arrows indicate the transcriptional orientation of these genes

Fine mapping of RpsUN2 on chromosome 16

To further narrow the region harboring RpsUN2, SSR markers near or at the boundaries of the previously mapped RpsUN2 region, BARCSOYSSR_16_1275 and Sat_144, were used to genotype the 826 F2 plants, and 23 recombinants between these two SSR markers were identified. Subsequently, the F2:3 families derived from these recombinants were phenotyped by inoculation with P. sojae race 25 (Online Resource 2), which is avirulent to RpsUN2 but virulent to RpsUN1. The corresponding F2 plants or pools of F3 seedlings from individual F2 plants were genotyped using two additional SSR markers showing polymorphisms between the two parental lines and four cleaved amplified polymorphic sequence (CAPS) markers designed based on genic sequences in the mapped region from the two parental lines. As shown in Fig. 2b, the recombinants WPT074, WP404, and PW555 placed the RpsUN2 locus downstream of the marker CAPS3, while the recombinants WP015, WP062, WP364, PW568, and PW547 placed it upstream of the marker CAPS4. The genotypic and phenotypic data from the remaining 15 recombinants were as expected and supported the data from these eight recombinants. Four additional CAPS markers and two SNP markers detected by direct sequencing of PCR fragments were designed based on sequences between CAPS3 and CAPS4 from the two parental lines and used to genotype the eight recombinants defined by CAPS3 and CAPS4. Based on the genotypic and phenotypic data from the six recombinants WP062, WP364, WP404, PW568, PW547, and PW555, the candidate for the RpsUN2 locus was defined to the genomic region between CAPS6 and SNP2. According to the soybean Williams 82 reference genome (Schmutz et al. 2010), this region contains four genes, all of which are of the typical R type (Online Resource 3). Unexpectedly, the genotypic and phenotypic data from two recombinants, WP015 and WPT074, were inconsistent with this interval. The former suggests that the RpsUN2 locus is located upstream of CAPS6, while the latter suggests that it is located downstream of CAPS8. Additional replicates were performed, and the phenotypic and genotypic data remained the same.

Fig. 2

Fine mapping of the RpsUN2 locus. a Linkage map of the RpsUN2 locus. b Physical positions of molecular markers (according to the Williams 82 reference genome) used for fine mapping of the RpsUN2 locus. c Recombinants carrying crossovers as determined by molecular markers and genotypes of the F2 plants deduced by phenotyping the F2:3 families. Black, white, and gray bars represent homozygous segments from PI 567139B, Williams 82, and heterozygous segments from both parental lines as determined by molecular markers, respectively. d Annotated genes in the fine mapped region of the Williams 82 reference genome. Black boxes indicate disease resistance-related genes, whereas gray boxes indicate other types of genes. Arrows indicate the transcriptional orientations of these genes

Expression changes of genes in the fine mapped RpsUN1 and RpsUN2 regions in response to P. sojae

Because the majority of the annotated genes in the fine mapped RpsUN1 and RpsUN2 regions in the Williams 82 reference genome are of the R type, such as NBS-LRRs, which are highly similar and often hotspots for unequal recombination that leads to chimeric structure and/or copy number variation (Michelmore and Meyers 1998; Hulbert 1998; Ellis et al. 2000; Nagy and Bennetzen 2008), we reasoned that further fine mapping of the two loci might not be effective for pinpointing the candidate genes for RpsUN1 and RpsUN2. To understand the genomic differences in the RpsUN1 and RpsUN2 regions between the two parental lines and the potential causative variation that distinguishes the resistant parental line from the susceptible parental line, primers were designed based on the three NB-ARC genes in the RpsUN1 region and the four NBS-LRR genes in the RpsUN2 region of Williams 82 and used to amplify their genomic and transcriptomic counterparts in PI 567139B. However, in many attempts, these genes in PI 567139B were either too similar to be distinguishable or difficult to amplify, possibly due to sequence diversity between the parental lines. As a result, the genomic and transcriptomic counterparts of these NBS-LRR genes in PI 567139B have not been accurately determined.

In an attempt to pinpoint the candidate genes for the RpsUN1 and RpsUN2 loci, we evaluated and compared, by RNA-seq, the expression levels of the five genes in the RpsUN1 region and the four genes in the RpsUN2 region, before and after inoculation with P. sojae, in the two parental lines, PI 567139B and Williams, and 10 Rps gene isogenic lines in the Williams background. To increase the accuracy of the evaluation, only the RNA-seq reads uniquely mapped to the reference genome were used to calculate the relative abundance of transcripts from each gene. The changes in relative abundance of transcripts from the nine genes upon inoculation with the pathogen are shown in Fig. 3 and Online Resource 4. Among the five genes in the RpsUN1 region, Glyma.03g034400, Glyma.03g034600, and Glyma.03g034800 were all up-regulated in PI 567139B, but down-regulated in Williams, upon inoculation with the pathogen. Up-regulation of these three genes was also detected in a few other Rps gene isogenic lines, but only Glyma.03g034400 showed the highest level of up-regulation among the 12 lines examined. Thus, Glyma.03g034400 may be the candidate for the RpsUN1 locus. Among the four genes in the RpsUN2 region, Glyma.16g214900 showed up-regulation in PI 567139B but down-regulation in Williams and all 10 Rps gene isogenic lines. Glyma.16g215000, Glyma.16g215100, and Glyma.16g215200 were also up-regulated upon inoculation with the pathogen, but such up-regulation was also detected in the majority of the Rps gene isogenic lines. Therefore, Glyma.16g214900 would be the best candidate for the RpsUN2 locus.
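One way to express an expression change of this kind is sketched below: the abundance of a gene is first scaled to that of a reference gene within each treatment, and the inoculated value is then compared with the wounded control on a log2 scale. The text does not state the exact formula behind Fig. 3, so this is an assumption for illustration only; the function name and the abundance values are hypothetical.

```python
import math


def log2_relative_change(gene_inoc, ref_inoc, gene_wound, ref_wound):
    """log2 change in reference-normalized abundance, inoculated vs wounded.

    Positive values indicate up-regulation upon inoculation, negative
    values down-regulation. All inputs are abundance estimates (for
    example FPKM) and must be positive for the ratio to be defined.
    """
    values = (gene_inoc, ref_inoc, gene_wound, ref_wound)
    if any(v <= 0 for v in values):
        raise ValueError("all abundance values must be positive")
    rel_inoc = gene_inoc / ref_inoc      # relative abundance, inoculated
    rel_wound = gene_wound / ref_wound   # relative abundance, wounded control
    return math.log2(rel_inoc / rel_wound)


# Hypothetical abundances: a four-fold increase and a four-fold decrease.
up = log2_relative_change(8.0, 2.0, 2.0, 2.0)
down = log2_relative_change(1.0, 4.0, 4.0, 4.0)
```

Genes without uniquely mapped reads in a treatment give an undefined ratio under this formulation, so they would have to be reported separately rather than assigned a fold change.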

Fig. 3

Expression changes of genes in the RpsUN1 and RpsUN2 regions revealed by RNA-seq. The abundance of each gene relative to the abundance of Cos4 was used to evaluate the relative expression level of the gene. No RNA-seq reads uniquely mapped to Glyma.03g034500 or Glyma.03g034700 were identified in any of the 12 lines

Haplotype variation of the RpsUN1 and RpsUN2 regions in the entire USDA soybean germplasm collection

Although RpsUN1 and RpsUN2 were fine mapped to two considerably small genomic regions, whether these two loci differ from other Rps loci in the overlapping/adjacent regions was unclear. This is mainly because different Rps genes were mapped at different scales and with different sets/types of molecular markers, so the genetic maps of these Rps genes are not directly comparable. To shed light on this question, we extracted the SNP genotypic data for a 560 kb genomic region surrounding the RpsUN1 locus and a 110 kb genomic region surrounding the RpsUN2 locus from the USDA Soybean Germplasm Collection, which comprises 19,652 Glycine max and Glycine soja accessions. The SNP data were generated using the SoySNP50K iSelect BeadChip, and each region contains 10 SNPs. Using these SNP data, we analyzed the phylogenetic population structure of the two regions. As shown in Fig. 4, the branch for the RpsUN1 region of PI 567139B is considerably distinct from the branches for the corresponding regions from the donor or ancestral lines of Rps1-a, Rps1-b, Rps1-c, Rps1-d, and Rps1-k, and the branch containing the RpsUN2 region of PI 567139B is considerably distinct from the branch containing the corresponding region from an ancestral line carrying Rps2.

Fig. 4

Haplotype variation of the RpsUN1 and RpsUN2 regions among the entire USDA Soybean Germplasm Collection as revealed by SNPs distributed in the two regions. a Phylogenetic tree constructed with SNPs in the RpsUN1 region. b Phylogenetic tree constructed with SNPs in the RpsUN2 region

Discussion

In this study, we fine mapped the RpsUN1 and RpsUN2 loci to two small genomic regions, which harbor only five and four genes, respectively, according to the soybean reference genome. In these regions, only two and three recombinants between the closest markers and the RpsUN1 and RpsUN2 loci, respectively, were found in the mapping populations comprising 826 F2 plants and/or F2:3 families; thus, the recombination frequencies between these markers and the two resistance loci are extremely low (0.0024 and 0.0036, respectively). If these markers are used for marker-assisted selection of these two genes in breeding programs, the accuracy of selection for the two resistance loci would be higher than 99.6 %. The accuracy can be further increased if effective markers within the mapped regions are identified and used in marker-assisted selection.
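The arithmetic behind these figures can be reproduced with a short sketch. It follows the convention used in the text of dividing the number of recombinants by the number of F2 plants, and it assumes, as a simplification, that a selection error arises only when a crossover separates the marker from the locus, so that the expected accuracy is one minus the recombination frequency. The function names are hypothetical.

```python
def recombination_frequency(n_recombinants, n_plants):
    """Recombinants between a marker and the locus, per plant screened."""
    if n_plants <= 0:
        raise ValueError("population size must be positive")
    return n_recombinants / n_plants


def selection_accuracy(rf):
    """Expected accuracy of marker-based selection, assuming errors arise
    only from recombination between marker and locus (a simplification)."""
    return 1.0 - rf


rf_un1 = recombination_frequency(2, 826)   # closest marker vs RpsUN1
rf_un2 = recombination_frequency(3, 826)   # closest marker vs RpsUN2
acc_un1 = selection_accuracy(rf_un1)
acc_un2 = selection_accuracy(rf_un2)
```

With these counts, both accuracies exceed 0.996, that is, 99.6 %.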

Nevertheless, we would like to note again that two recombinants (WP015 and WPT074) showed phenotypic data that were inconsistent with their genotypic data at some markers in the mapped genomic region. These two recombinants contradictorily positioned the RpsUN2 locus upstream and downstream of the fine mapped RpsUN2 region, respectively. As the RpsUN2 region in PI 567139B and these two recombinants has not been fully sequenced, it remains unknown how such an inconsistency arose. However, given that the fine mapped RpsUN2 region in Williams 82 contains only four NBS-LRR genes, which are flanked by additional NBS-LRR genes as part of a genomic region enriched with NBS-LRR genes (Schmutz et al. 2010), it is expected that the RpsUN2 region in PI 567139B is also enriched with NBS-LRR genes. It has been documented that NBS-LRR genes or gene clusters are generally inherently unstable and fast-evolving via recombination and rearrangements, such as duplications and deletions by unequal recombination (Michelmore and Meyers 1998; Hulbert 1998; Ellis et al. 2000; Leister 2004; Nagy and Bennetzen 2008), resulting in new forms of NBS-LRR genes as well as copy number variation. It is thus possible that the two recombinants showing inconsistent phenotypic and genotypic data are the outcome of unequal recombination mediated by NBS-LRR genes.

Although a single trait is generally controlled by a single gene, a genetic locus underlying a specific trait such as disease resistance could consist of multiple genes of the same or different types. Such genes could act independently of each other or function at variable levels according to their copy number variation. For example, the Rps1-k locus is composed of four tandemly arranged NBS-LRR genes, among which two individual copies were each fully responsible for complete resistance to P. sojae (Gao et al. 2005; Gao and Bhattacharyya 2008). By contrast, a recent study demonstrated that copy number variation of multiple genes at the Rhg1 locus mediates nematode resistance in soybean (Cook et al. 2012). Thus, additional genes in the mapped regions of PI 567139B, if any, which are absent in the corresponding regions of the reference genome, could also be involved in the control of resistance to P. sojae. If this is the case, the candidate genes for the RpsUN1 and RpsUN2 loci may not have been fully deduced based on the observed changes in relative abundance of the RNA-seq reads uniquely mapped to the two genomic regions of the reference genome.

Allelism tests are routinely used to determine whether a trait is controlled by two different alleles of a single gene or by two different genes. However, due to the complexity of many resistance loci and the potential instability of such loci and underlying traits, it would be difficult or ineffective to determine conclusively by allelism tests whether RpsUN1 and RpsUN2 are different from other Rps genes mapped to similar regions. Nevertheless, the haplotypic variation of the RpsUN1 and RpsUN2 regions between PI 567139B and the donor/ancestral lines carrying other Rps genes is obvious, which may reflect the distinct origins of the RpsUN1 and RpsUN2 loci and other known Rps genes/alleles mapped in their overlapping or adjacent regions.

Author contribution statement

JM & CC designed the research; LL, FL, WW, JP, JCF, MZ, SL performed the research; LL, FL, LS, CC, and JM analyzed the data; JM wrote the manuscript.