Introduction

Corema album (L.) D. Don belongs to the Ericaceae family and occurs naturally on sand dunes of the Atlantic coast of the Iberian Peninsula (ssp. album), and in the Azores Islands on volcanic lava and ash fields (ssp. azoricum Pinto da Silva). Besides its great ecological importance (Guitián et al. 1997; Zunzunegui et al. 2006), its edible white berries have been highly appreciated and exploited (Oliveira and Dale 2012) due to their high nutritional value and important antioxidant properties. These berries contain high amounts of antioxidants, phenolic acids and flavonoids and low amounts of anthocyanins (León-González et al. 2013; Andrade et al. 2017b) and are also an important source of fibre and sugars (Andrade et al. 2017a).

Despite these properties and their potential benefits for human health, this species is still poorly exploited commercially and is harvested only from the wild, without any cultivation. However, there is an increasing demand for berries with a distinct white colour. Therefore, establishing this species as a new crop with high-quality fruit, able to respond to consumer demand and explore market opportunities, is a challenge.

The characterization of DNA-based markers provides information on genetic diversity and relationships both within a population (intragroup diversity) and among different populations (intergroup diversity), which is useful knowledge for breeding selection (Agarwal et al. 2008). Inter-simple sequence repeat (ISSR) markers have shown great potential for assessing the genetic diversity of wild species and the structure of natural populations (Zietkiewicz et al. 1994; Ueno et al. 2015; Zoratti et al. 2015). The ISSR technique is simple, fast, highly reproducible and inexpensive, and no prior genome knowledge is needed for its implementation (Reddy et al. 2002). These markers have also been used in genetic studies on populations of Vaccinium, which also belongs to the Ericaceae family (An et al. 2015; Debnath 2007, 2009; Gawroński et al. 2017; Yakimowski and Eckert 2008). For C. album, the only molecular study using ISSR compared male and female plants to identify a putative sex-specific marker (Nóbrega et al. 2016).

The present work surveys wild populations of C. album along its distribution area to evaluate their genetic diversity using ISSR markers.

Materials and methods

Plant material and sampling sites

We sampled 20 female C. album plants from each of four populations (Table 1) located in different biogeographical units along the Atlantic coast of Portugal. For all plants, geographical coordinates were recorded using the Global Positioning System (GPS).

Table 1 Origin (collection site) and geographical coordinates of C. album populations

In all populations, the vegetation cover was dominated by C. album shrubs, and Pinus pinaster Ait. was present further inland of the dune systems, except in Monte Clérigo. In Comporta and Meco, the populations were at the interface between the dunes and the pine woodland, whereas the Duna de Quiaios population was under the canopy of the pine woodland.

Molecular data

DNA isolation and amplification

Fresh and healthy leaves were ground to a fine powder in liquid nitrogen using a pre-cooled mortar and pestle and then stored at −80 °C until used for DNA extraction.

Total genomic DNA was isolated from approximately 100 mg of leaf powder using the DNeasy Plant Mini Kit (QIAGEN, Hilden, Germany) following the manufacturer’s instructions. The concentration of DNA was estimated with the NanoDrop 2000 UV–Vis Spectrophotometer (Thermo Fisher Scientific, Massachusetts, USA).

A set of 20 ISSR primers was screened. Of these, six primers were selected based on their reproducibility and levels of polymorphism and used for the final analysis. The primers used, with their respective sequences and annealing temperatures, are shown in Table 2.

Table 2 List of primers used for ISSR amplification, sequences and annealing temperatures

PCR reactions were performed in a 25 μl final volume using the Promega GoTaq Flexi DNA Polymerase Kit (Promega, Madison, USA), containing 1 μl template DNA, 5 μl GoTaq Flexi PCR buffer, 1.5 mM MgCl2, 0.20 mM of each dNTP, 1.25 U GoTaq Flexi DNA Polymerase and 0.4 μM of each primer, in a Biometra Tgradient thermocycler (Biometra, Göttingen, Germany) (Nóbrega et al. 2016).

The amplification profile consisted of an initial denaturation of 3 min at 94 °C, followed by 40 cycles of 1 min at 94 °C, 1 min at the annealing temperature (Table 2) and 2 min at 72 °C, and a final extension of 10 min at 72 °C. Amplicons were separated by electrophoresis at 5 V cm−1 in a 1.5% agarose gel containing 0.5 μg/mL ethidium bromide, with 1 × TBE running buffer.

To obtain a representative survey of the distribution area, 20 accessions per population were sampled, as described above.

However, DNA from three plants from Meco, two from Monte Clérigo and four from Duna de Quiaios failed to amplify properly, showing reduced yields. Thus, although we aimed to analyse 80 accessions, these nine samples had to be excluded and only 71 genotypes were included in the assessment of genetic diversity.
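The sample accounting above can be checked with a few lines of arithmetic. The sketch below assumes that no Comporta accession was excluded (the text lists exclusions only for the other three sites); the variable and function names are illustrative and not taken from the study.

```python
# Accessions sampled per population, as stated in the text.
SAMPLED_PER_POPULATION = 20

# Accessions excluded for poor PCR amplification.
# Assumption: Comporta had no exclusions, since none are reported.
excluded = {
    "Comporta": 0,
    "Meco": 3,
    "Monte Clérigo": 2,
    "Duna de Quiaios": 4,
}


def retained_counts(sampled, excluded_by_population):
    """Return the number of accessions kept for each population."""
    return {pop: sampled - n for pop, n in excluded_by_population.items()}


retained = retained_counts(SAMPLED_PER_POPULATION, excluded)
total_retained = sum(retained.values())
print(retained, total_retained)
```

Under this assumption the per-population sample sizes used in the downstream analyses would be unequal, which matters for methods such as AMOVA that weight populations by size.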

Morphological data

A total of 10 quantitative phenotypic traits were assessed in detail, during flowering (March to May) and fruiting season (August and September) in all plants. The traits list included the most important leaf, flower and fruit features: plant volume, flower/inflorescences ratio, number of leaves per whorl, length of annual growths, branching number, number of fruits with more than 10.25 mm in diameter, number of fruits with diameter between 10.25 and 7.50 mm, number of fruits with less than 7.50 mm in diameter, percentage of white fruits and fruit dry/fresh weight ratio.

Of the 10 traits, only eight were used for further analyses, owing to collinearity among some of them and to their agronomic relevance, as explained in the statistical analyses of morphological variability. Thus, the number of fruits with a diameter between 10.25 and 7.50 mm and the number with a diameter of less than 7.50 mm were discarded.

Statistical analyses

Genetic diversity

The genetic diversity of the four populations was analysed using the six selected ISSR primers.

Scoring of ISSR amplification product sizes was carried out by considering only the clear and unambiguous bands. The results were transformed into a binary presence (1)—absence (0) matrix. Genetic diversity was estimated by three indexes calculated for each ISSR marker: the polymorphism information content (PIC) (Roldán-Ruiz et al. 2000), the resolving power (RP) (Prevost and Wilkinson 1999) and the marker index (MI) (Powell et al. 1996) and by the parameters: number of bands, number of polymorphic bands and percentage of polymorphic loci.
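As an illustration of how such indices can be derived from the binary matrix, the following sketch uses commonly cited formulations for dominant markers: PIC per band as 2f(1 − f), RP as the sum of band informativeness 1 − 2|0.5 − f|, and MI as PIC multiplied by the effective multiplex ratio. These formulas are an assumption here, since the text does not state the exact variants used, and values computed this way need not match Table 3. The toy matrix and function names are hypothetical.

```python
def band_frequencies(matrix):
    """Fraction of genotypes (rows) showing each band (column) in a 0/1 matrix."""
    n_genotypes = len(matrix)
    n_bands = len(matrix[0])
    return [sum(row[j] for row in matrix) / n_genotypes for j in range(n_bands)]


def primer_indices(matrix):
    """Summary indices for one primer from its presence/absence matrix."""
    freqs = band_frequencies(matrix)
    n_bands = len(freqs)
    # A band is polymorphic if it is neither absent nor present in all genotypes.
    n_poly = sum(1 for f in freqs if 0.0 < f < 1.0)
    # PIC for dominant markers, averaged over the bands of the primer.
    pic = sum(2 * f * (1 - f) for f in freqs) / n_bands
    # Resolving power: sum of band informativeness values.
    rp = sum(1 - 2 * abs(0.5 - f) for f in freqs)
    # Effective multiplex ratio: polymorphic bands times their fraction.
    emr = n_poly * (n_poly / n_bands)
    return {
        "NB": n_bands,
        "NPB": n_poly,
        "PPB": 100.0 * n_poly / n_bands,
        "PIC": pic,
        "RP": rp,
        "MI": pic * emr,
    }


# Hypothetical scores: 4 genotypes (rows) x 3 bands (columns).
toy = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]
result = primer_indices(toy)
print(result)
```

In this toy case the first band is monomorphic and therefore contributes nothing to PIC or RP, which is why only clearly polymorphic primers are informative for diversity estimates.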

The data obtained by scoring the ISSR profiles of the different primers were subjected to cluster analysis using the 71 × 71 matrix of the Nei-Li dissimilarity coefficient (Nei and Li 1979), computed with the stats R package (R Core Team 2013). The dendrogram was built with the UPGMA clustering method (unweighted pair group method with arithmetic mean) using the factoextra R package (Kassambara and Mundt 2017). The genetic structure and the differences between populations were evaluated through a Principal Coordinate Analysis (PCoA) using the vegan R package (Oksanen et al. 2013). A hierarchical analysis of molecular variance (AMOVA) was also performed using the pegas R package (Paradis et al. 2018).
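To make the clustering step concrete, the sketch below computes the Nei-Li dissimilarity, taken here as 1 − 2n_ab/(n_a + n_b), between binary band profiles and then applies a plain UPGMA (average linkage) agglomeration. It is a standard-library Python illustration with hypothetical profiles, not the R workflow used in the study.

```python
from itertools import combinations


def nei_li_dissimilarity(a, b):
    """1 - 2*shared/(bands in a + bands in b) for two 0/1 band profiles."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    total = sum(a) + sum(b)
    return 0.0 if total == 0 else 1.0 - 2.0 * shared / total


def distance_matrix(profiles):
    """Symmetric matrix of pairwise Nei-Li dissimilarities."""
    n = len(profiles)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = nei_li_dissimilarity(profiles[i], profiles[j])
    return d


def upgma(d):
    """Average-linkage clustering; returns merges as (members_a, members_b, height)."""
    clusters = [(i,) for i in range(len(d))]
    merges = []

    def average_distance(ca, cb):
        # Mean of all pairwise distances between members of the two clusters.
        return sum(d[i][j] for i in ca for j in cb) / (len(ca) * len(cb))

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: average_distance(*pair))
        merges.append((a, b, average_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical band profiles for four genotypes.
profiles = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
d = distance_matrix(profiles)
merges = upgma(d)
for members_a, members_b, height in merges:
    print(members_a, members_b, round(height, 3))
```

Recomputing average distances from the original matrix at every step keeps the code short; it is adequate for a few dozen genotypes but would be slow for large data sets.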

Morphological variability

A Kruskal–Wallis test among the four populations, at a significance level of α = 0.05, was performed on all traits assessed. A Fisher's post hoc test (α = 0.05) for mean separation was also conducted. The agricolae R package (De Mendiburu 2019) was used.
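For readers unfamiliar with the test, the Kruskal–Wallis statistic H can be sketched as follows: all observations are ranked together (ties receive average ranks), and H compares the rank sums of the groups, with the usual correction for ties. The data below are hypothetical and the code is a standard-library illustration, not the agricolae routine used in the study.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of groups of observations."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted_rank_sums = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted_rank_sums += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * weighted_rank_sums - 3 * (n + 1)
    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over groups of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h / correction if correction > 0 else 0.0


# Hypothetical trait values for three groups.
h_value = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(h_value)
```

H is compared with a chi-square distribution with (number of groups − 1) degrees of freedom; with four populations that is 3 degrees of freedom, for which the critical value at α = 0.05 is about 7.815.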

To explore relationships between variables across all plants of the four populations (n = 71), Spearman’s correlation (α = 0.05) was computed with the Hmisc R package (Harrell 2014).
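Spearman’s coefficient is the Pearson correlation computed on ranks, which the following standard-library sketch makes explicit. The trait values are hypothetical and the function names are illustrative; the significance test applied in the study is not reproduced here.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    denominator = math.sqrt(sxx * syy)
    return sxy / denominator if denominator > 0 else 0.0


def spearman(x, y):
    """Spearman rank correlation between two equally long sequences."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical measurements of two traits on five plants.
trait_a = [1, 2, 3, 4, 5]
trait_b = [5, 6, 7, 8, 7]
rho = spearman(trait_a, trait_b)
print(round(rho, 4))
```

Because only ranks are used, the coefficient is insensitive to monotonic transformations of either trait, which suits traits measured on very different scales.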

Traits were first selected according to their correlations and their agronomic interest. Eight traits were then used to perform a principal component analysis (PCA) with the factoextra R package (Kassambara and Mundt 2017), in order to find which traits best differentiate the populations.

A dissimilarity matrix was calculated from the eigenvalues extracted from the first three PCA axes, using the Mahalanobis distance, as in Pereira-Lorenzo et al. (2012). This was done with the StatMatch R package (D’Orazio and D’Orazio 2019). The dendrogram was also built using the UPGMA method with the factoextra R package (Kassambara and Mundt 2017). To evaluate the relationship between the genetic and morphological dissimilarity matrices, the Mantel test correlation was calculated using the vegan R package (Oksanen et al. 2013).
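The logic of a Mantel test can be sketched as follows: the Pearson correlation between the off-diagonal entries of two distance matrices is compared with the correlations obtained after randomly permuting the rows and columns of one matrix. The sketch is a standard-library illustration with hypothetical inputs; the number of permutations and the one-sided p-value convention are assumptions, not settings reported in the text.

```python
import math
import random


def upper_triangle(m):
    """Entries above the diagonal of a square matrix, row by row."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    denominator = math.sqrt(sxx * syy)
    return sxy / denominator if denominator > 0 else 0.0


def mantel(d1, d2, permutations=999, seed=0):
    """Return (r, p): matrix correlation and one-sided permutation p-value."""
    n = len(d1)
    rng = random.Random(seed)
    x = upper_triangle(d1)
    r_observed = pearson(x, upper_triangle(d2))
    hits = 0
    index = list(range(n))
    for _ in range(permutations):
        rng.shuffle(index)
        # Permute rows and columns of d2 with the same random ordering.
        permuted = [[d2[index[i]][index[j]] for j in range(n)] for i in range(n)]
        if pearson(x, upper_triangle(permuted)) >= r_observed:
            hits += 1
    return r_observed, (hits + 1) / (permutations + 1)


# Hypothetical example: both matrices are distances between the same points.
points = [0.0, 1.0, 3.0, 7.0, 12.0]
dist = [[abs(a - b) for b in points] for a in points]
r, p = mantel(dist, dist)
print(round(r, 3), p)
```

Permuting whole rows and columns, rather than individual cells, preserves the dependence among distances that share an individual, which is why an ordinary correlation test is not appropriate for distance matrices.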

Results

Genetic diversity

All six ISSR primers produced distinct, reproducible polymorphic banding patterns, with a total of 51 scorable bands, ranging from 6 to 10 bands per primer, with an average of 8 bands per primer (Table 3). Amplification products ranged from 500 bp to 2.0 kb.

Table 3 Genetic diversity estimates for the ISSR primers and C. album populations. NB: number of bands; NPB: number of polymorphic bands; PPB: percentage of polymorphic bands; PIC: polymorphism information content; MI: marker index; RP: resolving power

The polymorphism information content (PIC) values of the primers varied between 0.16 (primer M13) and 0.37 (primer UBC855), with an average of 0.31. The highest marker index (MI) was shown by primer M13 (17.41), with an average of 10.1 per primer. The highest resolving power was that of UBC840 (17.24), with an average per primer of 11.26. Regarding ISSR banding profiles, the genotypes from Meco and Comporta presented higher diversity than the genotypes from Monte Clérigo and Duna de Quiaios (Table 3).

The dendrogram resulting from the cluster analysis based on the 51 ISSR markers is shown in Fig. 1. Five distinct clusters were identified. Cluster four consisted of accessions from Duna de Quiaios and cluster five of accessions from Monte Clérigo. Accessions from Meco were dispersed across all clusters except cluster two, and those from Comporta were dispersed across clusters one, two and three.

Fig. 1 Dendrogram obtained with the UPGMA method using the Nei and Li coefficient for 71 plants of the four populations (r = 0.729)

The PCoA performed on the ISSR Nei-Li dissimilarity matrix among all genotypes is presented in Fig. 2. Axes 1 and 2 together accounted for 22% of the variation observed. Accessions from Duna de Quiaios, Comporta and Monte Clérigo were clearly discriminated. Accessions from Meco overlapped with the Monte Clérigo and Comporta populations. Comporta was the most scattered population and Duna de Quiaios formed a smaller group.

Fig. 2 Principal Coordinates Analysis (PCoA) of the molecular data, which explained 22% of the variation amongst populations
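The share of variation attributed to the PCoA axes can be illustrated with classical PCoA: squared distances are double-centred and the leading eigenvalues of the resulting matrix are divided by its trace. The sketch below uses power iteration with deflation to stay within the Python standard library, and a hypothetical Euclidean distance matrix. With non-Euclidean dissimilarities such as Nei-Li, negative eigenvalues may occur, and this simplified version does not correct for them.

```python
import math


def double_center(d):
    """Gower-centred matrix B computed from -0.5 * squared distances."""
    n = len(d)
    a = [[-0.5 * d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_means = [sum(row) / n for row in a]
    grand_mean = sum(row_means) / n
    return [[a[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def dominant_eigenpair(b, iterations=1000):
    """Leading eigenvalue and unit eigenvector of a symmetric matrix."""
    n = len(b)
    v = [1.0 + 0.1 * i for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient of the (unit) vector gives the eigenvalue estimate.
    value = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return value, v


def pcoa(d, axes=2):
    """Return (coordinates per axis, proportion of variation per axis)."""
    b = double_center(d)
    n = len(b)
    trace = sum(b[i][i] for i in range(n))
    coordinates, explained = [], []
    for _ in range(axes):
        value, vector = dominant_eigenpair(b)
        coordinates.append([x * math.sqrt(max(value, 0.0)) for x in vector])
        explained.append(value / trace if trace else 0.0)
        # Deflation: remove the extracted axis before searching for the next one.
        b = [[b[i][j] - value * vector[i] * vector[j] for j in range(n)]
             for i in range(n)]
    return coordinates, explained


# Hypothetical distances between the corners of a 3 x 4 rectangle.
rectangle = [
    [0.0, 3.0, 4.0, 5.0],
    [3.0, 0.0, 5.0, 4.0],
    [4.0, 5.0, 0.0, 3.0],
    [5.0, 4.0, 3.0, 0.0],
]
coordinates, explained = pcoa(rectangle)
print([round(e, 3) for e in explained])
```

In this toy case two axes recover all of the variation because the points lie in a plane; with 71 genotypes scored for 51 bands the variation is spread over many more axes, so a small share on the first two axes is not unusual.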

The AMOVA showed that the proportion of variation attributable to differences among populations was high (62.4%), whereas only 37.6% occurred within populations.
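The partition reported by a one-level AMOVA can be sketched from a distance matrix and population labels: sums of squared deviations are computed from squared pairwise distances, converted to mean squares, and then to variance components. The sketch follows the usual textbook formulation with a single hierarchical level; the data and names are hypothetical, and the permutation test for significance is omitted.

```python
def amova(d, groups):
    """One-level AMOVA from a distance matrix and one population label per row."""
    n = len(d)
    labels = sorted(set(groups))
    n_pops = len(labels)

    def squared(i, j):
        return d[i][j] ** 2

    ssd_total = sum(squared(i, j) for i in range(n) for j in range(i + 1, n)) / n
    ssd_within = 0.0
    sizes = []
    for label in labels:
        members = [i for i in range(n) if groups[i] == label]
        sizes.append(len(members))
        pair_sum = sum(squared(i, j)
                       for k, i in enumerate(members) for j in members[k + 1:])
        ssd_within += pair_sum / len(members)
    ssd_among = ssd_total - ssd_within
    msd_among = ssd_among / (n_pops - 1)
    msd_within = ssd_within / (n - n_pops)
    # Weighted average population size used for the among-population component.
    n0 = (n - sum(s * s for s in sizes) / n) / (n_pops - 1)
    var_within = msd_within
    var_among = (msd_among - msd_within) / n0
    total = var_among + var_within
    return {
        "var_among": var_among,
        "var_within": var_within,
        "pct_among": 100.0 * var_among / total,
        "pct_within": 100.0 * var_within / total,
        "phi_st": var_among / total,
    }


# Hypothetical example: two well-separated populations on a line.
positions = [0.0, 2.0, 10.0, 12.0]
labels = ["A", "A", "B", "B"]
dist = [[abs(a - b) for b in positions] for a in positions]
result = amova(dist, labels)
print(result)
```

Read the same way, the percentages reported above (62.4% among and 37.6% within populations) would correspond to a fixation index of about 0.62, assuming they are the two variance components of a one-level partition.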

Morphological variability

The Kruskal–Wallis test performed on the 10 morphological traits showed significant differences among populations (Table 4). Vegetative traits showed more significant differences among the four sites than the other traits. The number of fruits in each of the three classes was higher in Meco and Comporta, although no significant differences were found in the two larger fruit classes. The percentage of white fruits was significantly different only in Monte Clérigo. The dry/fresh fruit weight ratio was higher in Quiaios and lower in Comporta, which means that the pulp/seed size ratio was higher in Quiaios.

Table 4 Summary of mean differences in morphological traits among the four populations

Spearman’s correlation showed that the number of branches and the number of fruits with a diameter between 10.25 and 7.50 mm had the highest significant correlations with the other variables (data not shown); therefore, both traits were excluded from the principal component analysis (PCA). The remaining traits showed smaller or non-significant correlations.

The first three axes of the PCA accounted for 62.1% of the trait variation (Fig. 3). Length of annual growths, ratio of flowers per inflorescence and white fruit percentage had the most influence on PC1 (25.2% of the variation). Plant volume, number of fruits with more than 10.25 mm in diameter and dry/fresh fruit weight ratio had a positive loading on PC2 (24% of the variation). Finally, PC3, which accounted for 12.9% of the variation, was influenced by the number of fruits smaller than 7.5 mm in diameter and the number of leaves per two whorls.

Fig. 3 Principal component analysis (PCA) showing the dispersion of the individuals of the four populations sampled and the traits with the greatest influence on each population. Vl = Plant volume, RFI = flower/inflorescences ratio, NLW = Number of leaves per two whorls, LAG = Length of annual growths, BN = Branching number, HFC = Number of fruits with more than 10.25 mm in diameter, PWt = Percentage of white fruits, RP = fruit dry/fresh weight ratio

Cluster analysis based on the Mahalanobis distance, calculated as a dissimilarity measure between the 71 genotypes from the first three standardized principal components (PCs) and clustered by UPGMA, produced the dendrogram presented in Fig. 4. The morphological traits varied widely and the highest phenotypic diversity was observed in Comporta and Meco. However, the general distribution of the genotypes did not reflect their geographical origin.

Fig. 4 Dendrogram obtained with the UPGMA method using the Mahalanobis distance of the morphological data of the 71 plants. Each colour represents a cluster (r = 0.791)

The Mantel test between the morphological and molecular dissimilarity matrices showed a low correlation (r = 0.139; p = 0.004).

Discussion

The genetic improvement of any crop is dependent on the utilization of well characterized wild relatives and breeding techniques. The assessment of genetic diversity is a requirement to select high yielding genotypes.

Fifty alleles were identified by six ISSR markers, proving their ability to be used as polymorphic markers in C. album accessions. Although ISSR markers have been extensively used to assess genetic diversity, there are no reports for the genus Corema, which includes only two species. Thus, to our knowledge, this is the first study assessing molecular genetic diversity in this species.

Regarding the molecular data, the clustering analysis showed that Meco and Comporta were the populations with the most dispersed accessions in the dendrogram, which suggests a higher genetic diversity compared with Monte Clérigo and Duna de Quiaios.

Both the PCoA and the phylogeny reconstruction based on ISSR markers reveal a clustering pattern consistent with geographical location and suggestive of a response to variation in environmental conditions. In fact, the PCoA showed that the Comporta and Meco populations present a higher dispersion than the other populations. This analysis showed three distinct groups (Monte Clérigo, Comporta and Duna de Quiaios) and an overlap of Meco with Monte Clérigo and Comporta. The northern population (Duna de Quiaios) formed a different group from the three southern populations. However, the overlap between Meco and Monte Clérigo was unexpected, since the population geographically closest to Monte Clérigo was Comporta. A possible reason for this might be linked to seed dispersers: some studies indicate that seagulls are among the main dispersers of C. album (Calviño-Cancela 2002, 2004), and their capability to travel long distances could favour gene flow among populations (Calviño-Cancela 2011).

The AMOVA values differed from those observed in other species of the Ericaceae family. Studies that also used ISSR markers found higher within-population variance in Vaccinium myrtillus (Zoratti et al. 2015), Vaccinium angustifolium (Debnath 2009) and Vaccinium vitis-idaea (Debnath and Sion 2009), probably due to the self-pollination capability of Vaccinium, which is absent in Corema. C. album is wind cross-pollinated (Álvarez-Cansino et al. 2010; Guitián et al. 1997) and has a low germination percentage (Calviño-Cancela 2004), leading to a higher variation between populations.

The first three axes of the PCA explained 62.1% of the variation among agro-morphological traits. Although this is not a high percentage, similar results have been obtained for morphological traits in other species (Pereira-Lorenzo et al. 2012; Ciarmiello et al. 2015; Kouakou et al. 2018). All four populations partially overlap, Comporta being the population with the greatest morphological diversity. Monte Clérigo and Quiaios had almost symmetric distributions, even though they slightly overlap.

Comporta had plants with larger volumes and larger fruits. Quiaios stood out for the dry/fresh fruit weight ratio and the length of annual vegetative growth.

The cluster analysis of morphological traits revealed no distinct separation among the C. album genotypes growing in different geographic areas. This variability within and among populations has also been found in other studies (Burgos et al. 2018). The morphological traits of the species did not show any accordance with its geographical distribution, as, for example, Solouki et al. (2008) found in Matricaria chamomilla.

The Mantel test between morphological and molecular data yielded a quite low correlation value, similar to other studies (Fanizza et al. 1999; Allel et al. 2017; Giordani et al. 2017; Burgos et al. 2018). Several factors might explain the somewhat unexpected low correlation. Molecular markers could be covering parts of the genome with both coding and non-coding regions and could be less subject to selection pressures than morphological traits (Burgos et al. 2018; Semagn 2002). A correlation of morphological traits with environmental conditions, but with no genetic correspondence, could mean distinct phenotypes that are not distinct genotypes (Johns et al. 1997). Such discordance might relate to evolutionary and biogeographical processes that are beyond the aims of the current analysis (see also Martins et al. 2006).

Conclusion

The ISSR markers grouped the accessions according to their geographical distribution and, most importantly, revealed genetic diversity. Genetic diversity was higher between populations than within them. The morphological traits showed high diversity but no signal of geographic location. The Mantel test between genetic and morphological data showed a low correlation. Comporta and Meco had the highest molecular and morphological diversity, making them the populations of choice for future plant material selection.

As the first approach, to our knowledge, to the characterization of genetic and morphological diversity of C. album in Portugal, the study reveals a high degree of diversity among the accessions, which can be further used for crop improvement. This may provide an opportunity to enhance and boost the breeding strategy.