Introduction

Roses are cultivated worldwide for their beautiful flowers. Rose is the first ranked cut flower in world trade both in terms of quantity and value. Conventional taxonomy (Wissemann 2003) divides the genus Rosa (Family Rosaceae) into four subgenera, Hulthemia (Dumort.) Focke, Platyrhodon (Hurst) Rehder, Hesperhodos Cockerell and Rosa. Rosa, being the largest subgenus, includes about 95% of all species under this genus and is subdivided into 10 sections. The genus Rosa comprises about 120–200 species which are widely distributed in the northern hemisphere (Bruneau et al. 2007; Gudin 2000; Yü et al. 1985). Of the 120 species (most accepted) in the genus Rosa, only eight species viz. R. chinensis, R. damascena, R. foetida, R. gallica, R. gigantea, R. moschata, R. multiflora and R. wichuraiana are said to have played an important role in the development of modern garden roses (Gudin 2000).

India is one of the main centres of diversity with respect to Rosa species. Around 16 species and four hybrid species were found growing wild in different phyto-geographical zones of India (Rathore and Srivastava 1992). Of these, six wild species (R. brunonii, R. moschata, R. multiflora, R. cathayensis, R. macrophylla and R. webbiana) were found only in the Western Himalayan region; they performed well under a sub-temperate climate and possess excellent vegetative and floral characters, justifying their use as donor parents in rose improvement (Singh et al. 2017). In order to breed varieties suitable for local agro-ecological conditions, the use of native species is of paramount importance. These species can provide genes required for adaptability, tolerance to both biotic and abiotic stresses, fragrance, perpetual flowering, and hardiness.

Morphological traits such as plant growth habit, the shape and colour of prickles, leaflet shape and size, leaf serration and shape of stipules are the key features used to identify species at the taxonomic level. However, some accessions do not show the typical characteristics of a particular species, possibly because of out-crossing with other species, and hence more tools are needed to distinguish them. In an earlier study, we found the type of stipule to be the most important trait distinguishing 21 cultivated and 4 unknown wild Rosa species collected from the Himalayan region of India (Gaurav et al. 2018). Different accessions of R. damascena did not show the specific characteristics of the species, and the same was the case for the R. wichuraiana group (Gaurav et al. 2018). Thus, morphological characters alone are not sufficient to study the genetic diversity or classification of a species like rose, which is an outcrossing species with complexity in ploidy levels.

In the 1990s, molecular markers were developed for rose cultivar identification, and several of these, such as the RAPD, SSR and AFLP marker systems, were used for identifying species relationships in Rosa. SSR markers have been developed in rose by many researchers for studying the diversity of rose varieties as well as species from across the world (Aparna et al. 2019; Babaei et al. 2007; Esselink et al. 2003; Hibrand-Saint Oyant et al. 2008; Kimura et al. 2006; Panwar et al. 2015a, b; Samiei et al. 2009; Scariot et al. 2006; Smulders et al. 2009; Zhang et al. 2006). However, very limited work has been done on the diversity, conservation and utilization of rose species and varieties grown in India. Morphological and molecular characterization was carried out by Kaul et al. (2009) in interspecific hybrids of R. damascena × R. bourboniana using AFLPs and RAPDs. Panwar et al. (2015a) used RAPDs and ISSRs for studying Indian rose genotypes, while Rai et al. (2015) used RAPDs in studying the wild species of rose. Singh et al. (2017) also used AFLP and RAPD markers to study the diversity in cultivated and introduced varieties and some wild species of rose. Sufficient data are still not available on the different wild species of India. Hence, the present investigation was undertaken to analyse the diversity of rose species grown and available in India using SSR markers.

Materials and methods

Plant material

The experimental material for the current study consisted of 21 cultivated and four wild species (Rosa species 1-4) of the genus Rosa of family Rosaceae (Table 1). R. damascena was represented by three accessions and Rosa hybrida by two while the rest of the 19 species were represented by one accession each. The known cultivated 24 species which were collected from ICAR - National Bureau of Plant Genetic Resources (ICAR-NBPGR), Shimla, Himachal Pradesh and ICAR- IARI, regional station, Katrain, Himachal Pradesh were grown at ICAR - Indian Agricultural Research Institute (ICAR-IARI), New Delhi, for further studies. However, the four unknown wild species plants from Katrain could not be established at ICAR-IARI. The details of species along with accession number and source of collection are provided in Table 1.

Table 1 List of rose species used for genetic diversity analysis using SSR markers

DNA isolation and SSR analysis

All the molecular work was carried out in the laboratory of ICAR-National Institute for Plant Biotechnology (NIPB), New Delhi. DNA isolation was done according to Doyle and Doyle (1990) but with increased concentrations of CTAB (2%), β-mercaptoethanol (2 μL/mL) and PVP (2%) in a high-salt extraction buffer. High-quality purified DNA, free from metabolites, was obtained after treatment with RNase. The DNA concentration of each sample was then estimated both by spectrophotometer (Nanodrop 8000, Thermo Scientific, Waltham, MA, USA) and by comparing ethidium bromide-stained band intensities against standard λ DNA.

A total of 56 SSR markers developed from different species of rose were selected from various research studies and used for identification and classification of 28 cultivated and wild species of rose. Thirty-three SSR primer pairs with high polymorphism and reproducibility were selected from Hibrand-Saint Oyant et al. (2008). Eight and fifteen SSR primer pairs with higher polymorphism, as identified by Kimura et al. (2006) and Zhang et al. (2006), respectively, were also used. Details of the 56 primers are given in Supplementary Table 1. Diluted DNA (50 ng/μL) was used to amplify the SSR markers in a gradient thermal cycler (Applied Biosystems Inc., USA). A 15 μL PCR reaction mixture was prepared containing genomic DNA, dNTPs (10 mM), 10X PCR buffer (including MgCl2), forward and reverse primers (5 pmol each), Taq DNA polymerase (Takara) and nuclease-free water. The PCR programme consisted of an initial denaturation at 94 °C for 5 min, followed by 35 cycles of denaturation at 94 °C for 30 sec, annealing at 55 °C for 40 sec and extension at 72 °C for 30 sec, then a final extension at 72 °C for 10 min and storage at 4 °C. The amplified products were resolved in 3.5% MetaPhor™ (Lonza Rockland Inc., Rockland, ME, USA) agarose gels containing ethidium bromide (10 mg/mL) in 0.5X TBE buffer at a constant voltage of 100 V for 4.5 hours using a horizontal gel electrophoresis system (Bio-Rad, USA). A 100 bp DNA ladder (GeneDireX®, Biolink, India) was run alongside the amplified products to determine the approximate band size. The amplified fragments were visualized and photographed under UV light using a gel documentation system (Syngene, UK).

Data analysis

Though SSRs are traditionally known as locus-based markers, the various ploidy levels in the genus Rosa meant that multiple PCR amplicons per primer pair were amplified in many species. Hence, bands were not scored as for a co-dominant marker (i.e., not by allele type specified by length polymorphism) but by the presence or absence of the amplicons. Thus, the complete set of presence/absence scores for all alleles at each SSR locus was considered the ‘allelic phenotype’ of a genotype (Becher et al. 2000). In order to generate the genetic similarity index (SI), a binary matrix was generated by assigning a score of 1 for the presence and 0 for the absence of a given band for an allele. Similarly, a binary matrix was generated for all the alleles in a gel. Only clear and reproducible bands were considered for the analysis. In order to check the efficiency of SSRs in diversity analysis, the following parameters were considered for each marker: major allele frequency, number of polymorphic bands, number of monomorphic bands, fraction of polymorphic loci (β), polymorphism information content (PIC), expected and observed heterozygosity (He and Hp), marker index (MI), and resolving power (Rp) (Chesnokov and Artemyeva 2015; Shingote et al. 2019). The fraction of polymorphic loci (β) was calculated as the proportion of polymorphic amplicons to the total number of amplicons produced for each primer pair, $\beta = n_p/n$. PIC was calculated as $\mathrm{PIC} = 1 - \sum_i p_i^2$, where $p_i$ is the frequency of the ith allele. MI was calculated as the product of expected heterozygosity, β and the number of polymorphic bands. Rp was calculated as $\mathrm{Rp} = \sum I_b$, with $I_b = 1 - (2 \times |0.5 - p|)$, where $I_b$ refers to allele informativeness and $p$ to allele frequency. Scored data were used to compute pair-wise Jaccard’s similarity coefficients (Jaccard 1908) using the NTSYS-pc version 2.21s (Rohlf 1998) package. Jaccard’s similarity matrix was used for cluster analysis by the unweighted pair-group method with arithmetic averages (UPGMA) in the sequential, agglomerative, hierarchical and nested (SAHN) clustering algorithm. Principal component analysis (PCA) was done using ‘R’ software to identify multidimensional relationships among various morphological traits used for dividing the genotypes into various groups (Revelle 2017).
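The marker statistics described above can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the study: the function names and the toy band scores are hypothetical, the Rp function assumes the commonly used absolute-value form of band informativeness, Ib = 1 − 2|0.5 − p|, and how allele frequencies are derived from presence/absence scores is left to the caller.

```python
def fraction_polymorphic(bands):
    """bands: one 0/1 list per amplicon, one entry per accession.
    A band is polymorphic if it is present in some accessions but not all."""
    n_poly = sum(1 for b in bands if 0 < sum(b) < len(b))
    return n_poly / len(bands)


def pic(allele_freqs):
    """PIC = 1 - sum(p_i^2) over the allele frequencies of one marker."""
    return 1.0 - sum(p * p for p in allele_freqs)


def marker_index(expected_het, beta, n_polymorphic):
    """MI = expected heterozygosity x beta x number of polymorphic bands."""
    return expected_het * beta * n_polymorphic


def resolving_power(band_freqs):
    """Rp = sum of Ib, with Ib = 1 - 2*|0.5 - p| (absolute-value form assumed)."""
    return sum(1.0 - 2.0 * abs(0.5 - p) for p in band_freqs)


# Hypothetical scores: three bands from one primer pair across four accessions.
toy_bands = [
    [1, 1, 0, 0],  # present in half the accessions
    [1, 1, 1, 1],  # monomorphic
    [1, 0, 0, 0],  # present in one accession
]
toy_band_freqs = [sum(b) / len(b) for b in toy_bands]  # 0.5, 1.0, 0.25

beta = fraction_polymorphic(toy_bands)  # 2 of 3 bands are polymorphic
rp = resolving_power(toy_band_freqs)    # 1.0 + 0.0 + 0.5
```

A band present in exactly half of the accessions contributes the maximum informativeness (Ib = 1), while a monomorphic band contributes nothing, which is why Rp rewards markers whose bands split the accessions evenly.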

Results

Characterization of Rosa species using SSR markers

Out of the 56 SSR markers (Supplementary Table 1) tested in the rose species, only 33 primers gave robust and reproducible amplification across all the species. The map position and linkage group (LG) information of 23 of the 33 primers is presented in Supplementary Table 2, which shows that all the LGs were covered except LG7. For the remaining 10 SSRs this information could not be identified, as they have not yet been placed on any genetic map. Among these 33 SSRs, only 24 were polymorphic while the rest were monomorphic, producing a total of 47 different bands (Table 2). The SSR primers amplified in almost all species of the genus Rosa, with quite a few null alleles (278 out of 1316, an average of 21%). Still, a major portion of the markers (79%) exhibited cross-transferability among different species in the genus Rosa. The number of alleles ranged from 1 to 3 per marker locus with an average of 1.42 (Table 2). The PIC value is an important attribute of a primer which describes its ability to differentiate various individuals. The average PIC value was 0.365; the highest PIC value (0.87) was observed for SSR Rw60A16, followed by RA027a (0.805) and H4F06 (0.793), while the lowest value was found for SSR Rw5G14 (0.102). The average marker index (MI) was 1.253; the highest MI value was observed for Rw29B1 (5.2), followed by Rw52D4 (4.88) and H20D08 (4.72), while the lowest MI value (0.654) was observed for SSR H5F12. Resolving power (Rp), the discriminatory potential of a marker to distinguish genotypes or individuals, was highest for Rw60A16 (1.6), followed by RA027a (1.55) and C187 (1.26), while the lowest was for Rw5G14 (0.11), with the average Rp being 0.416. A high level of variation was detected by primers such as Rw60A16, RA027a, H4F06, C187 and Rw34L6 (Table 2).
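The percentages quoted above follow from simple arithmetic, sketched here in Python. Reading 1316 as 47 bands scored across 28 accessions is an inference from the reported counts rather than something the text states explicitly, and the variable names are illustrative.

```python
total_bands = 47    # bands produced by the 33 SSRs (Table 2)
accessions = 28     # accessions scored
markers = 33        # SSRs with reproducible amplification
null_scores = 278   # null alleles reported

total_scores = total_bands * accessions          # assumed origin of the reported 1316
null_fraction = null_scores / total_scores       # share of null scores
amplified_fraction = 1 - null_fraction           # share showing amplification
mean_alleles_per_marker = total_bands / markers  # average alleles per locus

print(f"{null_fraction:.0%} null, {amplified_fraction:.0%} amplified, "
      f"{mean_alleles_per_marker:.2f} alleles per marker")
```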

Table 2 Details of the 33 SSR markers amplified in terms of their marker informativeness revealing the discriminatory power of each primer

Genetic diversity and relationship between Rosa species by cluster analysis using UPGMA method and Jaccard’s similarity coefficients

Genetic similarity among 28 rose species as determined by Jaccard’s pairwise similarity coefficient (Supplementary Table 3) ranged from 0.33 to 1.00 indicating moderate diversity. Maximum similarity with similarity coefficient of 1.00 was observed between three pairs of species, R. brunonii with R. dumalis; R. macrophylla with R. hybrida cv. Rose Sherbet; and R. tomentosa with R. damascena cv. Jwala. Further, R. chinensis ‘Viridiflora’ showed higher similarity with R. macrophylla and R. hybrida cv. Rose Sherbet (0.97). Maximum dissimilarity was observed between R. tomentosa and R. slancensis with similarity coefficient of 0.33. R. brunonii, a species native to India, exhibited great divergence from R. multiflora and R. damascena cv. Rani Sahiba.
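As an illustration of how a pairwise coefficient of this kind is obtained from presence/absence scores, the sketch below computes Jaccard's coefficient for two binary band profiles. The profiles are invented for the example, and treating two all-absent profiles as identical is a convention assumed here rather than taken from the study.

```python
def jaccard_similarity(x, y):
    """Jaccard coefficient a / (a + b + c) for two equal-length 0/1 profiles,
    where a = bands shared, b and c = bands present in only one profile.
    Shared absences (0/0) are ignored."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    shared = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    either = sum(1 for i, j in zip(x, y) if i == 1 or j == 1)
    if either == 0:
        return 1.0  # assumed convention: two empty profiles count as identical
    return shared / either


# Hypothetical band profiles for three accessions
acc_a = [1, 1, 0, 1, 0, 0]
acc_b = [1, 1, 0, 1, 0, 0]  # identical to acc_a
acc_c = [1, 0, 1, 0, 0, 1]

same = jaccard_similarity(acc_a, acc_b)  # identical profiles
diff = jaccard_similarity(acc_a, acc_c)  # 1 shared band out of 5 present in either
```

A coefficient of 1.00 therefore means that two accessions had identical band profiles for the markers scored, which is not necessarily the same as being genetically identical.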

Based on Jaccard’s pairwise similarity coefficients, clustering by the unweighted pair-group method with arithmetic averages (UPGMA) was done and a dendrogram depicting the genetic relationships among the Rosa species was constructed. It grouped the 28 accessions into five major clusters. Two main clusters branched at a similarity coefficient value of 0.56 (Fig. 1). Cluster I consisted of R. nitida × R. rugosa, Rosa species 2 and Rosa species 4. Cluster II was the largest, consisting of eleven species, and was further divided into two sub-clusters; the first sub-cluster included five species viz., R. macrophylla, R. hybrida cv. Rose Sherbet, R. chinensis ‘Viridiflora’, R. damascena cv. Himroz and R. damascena cv. Rani Sahiba, and the second sub-cluster included R. glutinosa, R. bourboniana and R. rubiginosa. Three species viz., R. indica major, R. moschata and Rosa species 1 were part of the cluster but placed singly. Cluster III consisted of six species separated into two sub-clusters; R. wichuraiana, R. multiflora, R. inodora and R. rubrifolia were placed in the first sub-cluster, while R. damascena (exotic collection) and R. odorata were placed in the other. Grouping of R. damascena and R. odorata in one sub-cluster was also reported by Singh et al. (2017) using RAPDs and AFLPs. Cluster IV included R. slancensis, R. hybrida cv. Dr. Huey and R. banksiae, while cluster V included the remaining five species, namely R. brunonii, R. dumalis, R. tomentosa, Rosa species 3 and R. damascena cv. Jwala (Fig. 1).
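The UPGMA procedure behind such a dendrogram can be sketched as follows. This is a minimal pure-Python illustration and not the NTSYS-pc implementation used in the study; the function name, the integer cluster ids and the three-accession similarity matrix are hypothetical, and similarities are converted to distances as 1 − similarity.

```python
def upgma(dist_matrix):
    """Average-linkage (UPGMA) clustering of a symmetric distance matrix.
    Items are numbered 0..n-1; each merge creates a new id n, n+1, ...
    Returns a list of merges as (id_a, id_b, distance, new_cluster_size)."""
    n = len(dist_matrix)
    size = {i: 1 for i in range(n)}
    d = {(i, j): dist_matrix[i][j] for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(size) > 1:
        # Pick the closest pair of current clusters (ties broken by id).
        (a, b), dab = min(d.items(), key=lambda kv: (kv[1], kv[0]))
        merges.append((a, b, dab, size[a] + size[b]))
        # Distance from the merged cluster to every other cluster is the
        # size-weighted mean of the two old distances.
        for k in size:
            if k in (a, b):
                continue
            dak = d[(min(a, k), max(a, k))]
            dbk = d[(min(b, k), max(b, k))]
            d[(k, next_id)] = (size[a] * dak + size[b] * dbk) / (size[a] + size[b])
        # Remove all distances that involve the two merged clusters.
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        size[next_id] = size.pop(a) + size.pop(b)
        next_id += 1
    return merges


# Hypothetical Jaccard similarities among three accessions
sim = [
    [1.0, 0.8, 0.4],
    [0.8, 1.0, 0.2],
    [0.4, 0.2, 1.0],
]
dist = [[1.0 - s for s in row] for row in sim]
merges = upgma(dist)
```

In this toy case accessions 0 and 1 join first, and accession 2 then joins them at the mean of its distances to the two, which is the step that distinguishes UPGMA from single or complete linkage.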

Fig. 1 Genetic relationship of the 28 Rosa species based on the similarity index values generated from the 24 polymorphic SSR markers using Jaccard’s similarity coefficient matrix and constructed by the UPGMA method

Principal component analysis

Principal component analysis helps in identifying the most relevant characters and presents them in more interpretable, more easily visualized dimensions through linear combinations of variables that account for most of the variation present in the original set of variables; we therefore processed the data for PCA. The first five principal components captured more than 80% of the variability (standardized loadings based upon correlation for the PCA are given in Supplementary Table 4). The first component accounted for 32.5% of the variability, while the succeeding components accounted for 18.2%, 14.5%, 8.7% and 7.4%, respectively. The principal components separated the rose species into four major clusters. The grouping differed from that of the cluster analysis, especially for cluster V, in which two pairs of species were placed together: R. brunonii with R. dumalis, and R. tomentosa with R. damascena cv. Jwala (Figs. 1 and 2). According to the PCA, however, these species were placed distantly, even though in the same coordinates (Fig. 2 and Supplementary Table 5). Thus, compared to the UPGMA-based clustering, PCA explained the variations better.
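The cumulative share of variability captured by the leading components can be checked directly from the per-component percentages reported above. The short sketch below does only that arithmetic; it does not reproduce the PCA itself, and the variable names are illustrative.

```python
from itertools import accumulate

# Percentage of variability for PC1..PC5 as reported in the text
explained = [32.5, 18.2, 14.5, 8.7, 7.4]
cumulative = list(accumulate(explained))

first_two = cumulative[1]   # PC1 + PC2, just over half of the variability
first_five = cumulative[4]  # PC1..PC5, just over 80%

print(f"PC1-2: {first_two:.1f}%  PC1-5: {first_five:.1f}%")
```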

Fig. 2 Principal component analysis of the 28 Rosa species using 24 SSR markers which resolved the accessions in five principal components with the first two components explaining > 50% of the variation

Discussion

Despite numerous taxonomic studies of the well-known genus Rosa (Ku and Robertson 2003; Lewis 1957; Rehder 1940; Wissemann 2003; Yü et al. 1985), species relationships within Rosa still remain problematic. Phylogenetic classification among species has been very difficult because of intra-specific variability, polyploidy, inter-specific hybridization (Koopman et al. 2008; Wissemann 2003) and the publication of numerous names given to morphological variants and hybrids. Still, the subgeneric and sectional classification system of Rehder (1940) is often used (four subgenera, ten sections). The classical phylogenetic approach relies on the morphological characteristics of an organism, but as Rosa has a wide and overlapping range of morphological variation that is influenced by environment, classification based on morphology alone is not adequate (Lewis 1957). Wild roses are highly valuable as germplasm, being the reservoir of various genes responsible for important ornamental traits. Although some wild species like R. damascena, R. bourboniana and R. moschata have been used for essential oils, most of them are underutilized and have not yet realized their potential (Singh et al. 2017). Proper knowledge of the genetic diversity in rose species can help breeders to understand the germplasm characteristics and to breed the best progeny.

As morphological analysis based on DUS descriptors in our previous study (Gaurav et al. 2018) alone was not sufficient to characterize the Rosa species, SSR markers developed by various researchers were used over other molecular markers due to high reproducibility, a high degree of polymorphism, co-dominant inheritance and uniform distribution in the genome (Tribhuvan et al. 2019). As they are conserved among the closely related species, they could provide a marker database for phylogenetic and evolutionary studies. The present study reports an identification method for the discrimination of cultivated and wild species of rose.

The SSR markers, in general, amplified in almost all the species of the genus Rosa used, with few null alleles, which showed the transferability of these SSR markers between different species in the genus. Transferability of SSR markers between related species has already been reported in the genus Rosa by Hibrand-Saint Oyant et al. (2008), Samiei et al. (2009) and Zhang et al. (2006), but all these studies involved a comparatively smaller number of genotypes from different species or fewer markers. Transferability of the markers is also supported by the fact that the SSR markers were developed only from R. wichuraiana and R. hybrida but amplified in almost all the genotypes used in the study, which comprised 25 different cultivated and wild species.

A high level of polymorphism was detected by the selected SSR markers in rose species, which might be due to differences in ploidy level, natural hybridization, mutations, recombination, and random segregation of heterozygous chromosomes during meiosis (Kaul et al. 2009; Nadeem et al. 2014). A high level of variation was detected by primers such as Rw60A16, RA027a, H4F06, C187 and Rw34L6, which could be very useful in fingerprinting. Nadeem et al. (2014) also analysed the diversity of Rosa hybrida L. using SSR markers and detected a high level of variation at the loci H10D03, CL2996 and Rw54N22. Of these, only H10D03 was utilized in our study, where it was found to be monomorphic. PIC and Rp are reported to be better than MI for describing the discriminatory power of a primer to distinguish various genotypes or individuals (Shingote et al. 2019). Kimura et al. (2006) reported an average power of discrimination of 0.79, ranging from 0.29 to 0.95, while analysing the diversity of 24 modern garden roses using 13 SSR markers. Singh et al. (2017) reported a wide range of genetic diversity (0.25−0.90) among rose genotypes using RAPD and AFLP markers. Further, they also reported wide genetic diversity among different rose species such as R. brunonii, R. multiflora and R. damascena.

The grouping of genotypes together in a cluster may occur due to common parentage (descent), phenotypic similarities among the species, the genetic makeup of genotypes (identity by state) and a shared gene pool. R. damascena is primarily cultivated for oil extraction, largely in India, Pakistan and Iran. Greater diversity was observed among the accessions of this species, namely the R. damascena exotic collection and its cultivars Rani Sahiba, Jwala and Himroz. Himroz was in a separate cluster from that of Rani Sahiba, Jwala and R. damascena. Rai et al. (2014) also reported R. damascena as a highly diverse species in which Jwala and Rani Sahiba fell in a separate cluster from Himroz. Jwala and Rani Sahiba are suitable for cultivation in sub-tropical plains and at lower altitudes, whereas Himroz is suitable for cultivation in temperate areas and at slightly higher altitudes. This indicates that Jwala and Rani Sahiba were selections from similar or related land races of Rosa damascena. Although the four wild Rosa species, FLS/2016/RW1 (Rosa species 1), FLS/2016/RW2 (Rosa species 2), FLS/2016/RW3 (Rosa species 3) and FLS/2016/RW4 (Rosa species 4), could not be assigned to any cultivated or known species, Rosa species 2 and Rosa species 4 were grouped with R. nitida × R. rugosa, supporting the hybrid origin of these accessions. Similarly, Rosa species 3 clustered with R. brunonii and R. dumalis at a similarity level of 0.75, which suggests they might be involved in its parentage. However, Rosa species 1 clustered distinctly and hence could not be assigned parentage. Singh et al. (2017) also reported that some strains exhibiting typical characteristics of a species may not align with that particular species group, which could be due to natural out-groups present within a species.

Section Synstylae of Rehder’s classification includes four species viz. R. multiflora, R. brunonii, R. moschata and R. wichuraiana. In our study, however, only two, R. multiflora and R. wichuraiana, were placed in the same cluster, while R. brunonii and R. moschata, both of Indian origin, were present in completely different clusters. Thus, the present study partially supported Rehder’s classification. Some researchers considered section Synstylae to be monophyletic (Matsumoto et al. 2000; Wu et al. 2000, 2001); our study, however, strongly supported a polyphyletic origin, as the four species were found in three different clusters showing huge variation. Koopman et al. (2008) also reconstructed relationships in Rosa species by applying AFLP markers with various methods such as UPGMA clustering, Wagner parsimony and Bayesian inference, and suggested that section Synstylae is polyphyletic. However, they also considered the data inconclusive as to the phylogenetic structure of sect. Synstylae. Our study thus adds support to the polyphyletic origin with a newer set of markers. R. brunonii exhibited high similarity with R. dumalis of section Caninae, while R. inodora of section Caninae was in the same clade as R. multiflora, suggesting easy out-crossing between the two sections. R. glutinosa, R. rubiginosa, R. inodora, R. rubrifolia and R. tomentosa are the other members of section Caninae and fell into different clusters, although, in accordance with Rehder’s arrangement, R. glutinosa and R. rubiginosa were grouped together, as were R. inodora and R. rubrifolia. Our study also supports the view that section Caninae is not monophyletic (Koopman et al. 2008; Olsson et al. 2000; Wissemann 2000).

In agreement with Rehder’s classification, the members of section Indicae, R. indica major, R. chinensis ‘Viridiflora’ and R. bourboniana, fell in the same cluster, except for R. odorata. Cultivar Rose Sherbet fell into the same cluster as R. bourboniana. Rose Sherbet is a seedling of “Gruss an Teplitz”, which is widely cultivated in Pakistan and Iran for essential oil extraction, is locally known as Surkha, and has been classified either as a Hybrid China or a Bourbon rose or a mix of both (Nawaz et al. 2011; Farooq et al. 2013). Both Rose Sherbet and R. bourboniana exhibited closeness to R. chinensis ‘Viridiflora’. Rose Sherbet is a repeat-flowering variety, confirming the role of R. chinensis in its parentage. It is a known fact that R. bourboniana was derived from a cross between R. chinensis and R. damascena (Rehder 1940). This indicates the usefulness of SSR markers for the study of parent-progeny relationships or lineage. Dr. Huey, which is a hybrid wichuraiana, has shown a lot of variation from its ancestor R. wichuraiana, as they fell into separate clusters. The species of the genus Rosa are thus highly variable, owing to easy cross-hybridization, and hence the conception of species varies greatly according to the views of different taxonomists, starting from the highly rated classification of Rehder.

We compared the five clusters obtained using SSRs with our previous data based on 18 morphological (DUS) descriptors for 28 Rosa spp., of which the material of the present study formed a subset (Gaurav et al. 2018). We found many similarities (14/24 known Rosa spp.) in the clusters obtained: seven entries in the largest SSR cluster (II) matched the grouping obtained using DUS descriptors (R. macrophylla, R. chinensis ‘Viridiflora’, R. damascena cv. Himroz, R. damascena cv. Rani Sahiba, R. glutinosa, R. indica major and Rosa species 1); similarly, 3/6 and 4/5 entries in clusters III (R. wichuraiana, R. multiflora, R. rubrifolia) and V (all the four unknown accessions of Rosa spp.) coincided with our earlier clustering. Interestingly, the high degree of similarity between R. brunonii and R. dumalis was supported by both molecular and morphological (DUS) data, while that between R. macrophylla and R. hybrida cv. Rose Sherbet shown in the present study could not be supported by the DUS data. This could be because the SSRs used in the study were largely R. hybrida specific and not R. macrophylla specific, suggesting that, to resolve the genetic relationships better, it would be worthwhile to generate novel SSRs from other Rosa spp. in addition to those from R. hybrida and R. wichuraiana. In the case of the unknown wild spp., the grouping by DUS descriptors in the previous study (Gaurav et al. 2018) was in complete disagreement with that of the present study, except for the grouping of wild species 3. Thus, the grouping of the unknown collections needs more support from morphological, molecular and cytological markers.

In conclusion, the SSR markers used in the study were highly informative and robust at the species level as well as at Rehder’s section level in the genus Rosa, with a few notable differences. The SSR markers showed cross-transferability across Rosa species and enough discrimination power to group them. The grouping based on just 24 genome-wide markers mostly matched the grouping based on 18 parameters including plant growth habit, shoot, leaf, stipule, pigmentation and prickle characteristics. The 24 markers identified as polymorphic in this study can be used for typing the rose germplasm collections of the country and in genome-wide association studies. These microsatellites (SSRs) can be used to protect the wealth of various species grown in our country and for pedigree analysis. They can be utilized to sort out the anomalies in rose taxonomy and further enlighten us about the origin of important varieties. Further studies are required to verify these markers in other cultivated and wild species of rose.