Introduction

Gene banks play a critical role in the conservation of agro-biodiversity at national and international levels (Rao et al. 2006; Agrawal et al. 2007). Germplasm of diverse plant species is maintained in gene banks worldwide, with individual collections holding anywhere from hundreds to tens of thousands of accessions (FAO 1997). The conservation, management, and use of germplasm maintained in gene banks pose many challenges to genetic resources conservators and to researchers using the collections. Central to sustainable conservation and effective use of the collections for breeding and other research is knowledge of the genetic diversity and taxonomy present in the gene bank. Hence, the characterization of the accessions maintained in the collection and the examination of the genetic relationships among them are important for the efficient conservation and use of the collections (Teklu et al. 2006).

Daucus carota L. is a morphologically diverse species found in wild or feral form throughout the Mediterranean, southwest Asia, Africa, Australia, New Zealand and the Americas (Peterson and Simon 1986; Vaughan and Geissler 2009). The gene centers for D. carota include Asia Minor, Transcaucasia, Iran, Turkmenistan, northwest India, Afghanistan, Tadjikistan, Uzbekistan and western Tian-shan mountain system of central Asia (Bradeen et al. 2002). Carrot domestication likely occurred in Central Asia (Iorizzo et al. 2013), but among Mediterranean regions, Tunisia is considered a center of biodiversity for Daucus and many other crops because of the diverse ecosystems and climatic conditions (Pottier Alapetite 1979; Le Floc’h et al. 2010), and is the center of diversity for important members of the D. carota clade, D. carota subsp. capillifolius and D. sahariensis. This clade is supported by a series of molecular studies (Spalik and Downie 2007; Arbizu et al. 2014a; Banasiak et al. 2016; Spooner et al. 2017), and is defined by shared chromosome numbers of 2n = 18 and intercrossability of its component members (e.g., Krickl 1961; McCollum 1975, 1977; Vivek and Simon 1999; Hauser and Bjørn 2001), allowing for gene transfer of great potential economic use by carrot breeders.

The latest taxonomic monograph of Daucus by Sáenz Laín (1981) recognized 20 species, and Rubatzky et al. (1999) later estimated 25 species. In Tunisia, Pottier Alapetite (1979) recognized 11 Daucus species with several subspecies, to which we add D. carota subsp. capillifolius, described from western Libya, but found by us in adjacent eastern Tunisia. Many tools are now available for studying variability and the relationships among accessions, including total seed protein, isozymes, molecular markers, and DNA sequence data (May 1992; Freville et al. 2001; Schlötterer 2004; Spooner et al. 2005). Daucus has been studied by a variety of DNA sequence techniques (Spalik and Downie 2007; Spooner et al. 2013, 2017; Arbizu et al. 2014a, 2016a, b; Banasiak et al. 2016). However, in addition to these powerful molecular methods to help define phylogenetic relationships, knowledge of the phenotype is critically important for practical taxonomic identifications and for traits of use to plant breeders (Sudré et al. 2010). Although many floristic treatments have been published in the past few decades, many Daucus species, and most critically for this study the subspecies variation in D. carota, are not fully understood. These treatments frequently use different characters and character states in their taxonomic keys and have incomplete synonymies, which precludes comparison of their taxonomic concepts and leads to ambiguous identifications. Recently, Spooner et al. (2014) and Arbizu et al. (2014b) attempted to quantify and describe the morphological variation in the Daucus collection conserved at the US genebank at the North Central Regional Plant Introduction Station (NCRPIS), for wild and cultivated carrots. We here examine the morphological diversity within the Tunisian National GeneBank collections of the D. carota clade at the inter- and intra-specific levels, using a larger morphological dataset than done previously with just fruit characters (Mezghani et al. 2014).
We also expand the study of Mezghani et al. (2014) by assessing 32 morphological characters related to the full range of plant parts to include leaves, stems, and flowers, and in addition analyze the data from all individuals, not averaged data as in all prior studies. The result of this study adds to a growing body of morphological data in Daucus to aid in defining these taxa, will serve to produce an updated monograph of the genus, and will aid in the management, conservation and use of genetic resources.

Materials and methods

Plant material

We examined nine individuals each from 45 accessions (405 individuals in total) of the Daucus carota clade derived from a collection of 103 accessions conserved at the National Gene Bank of Tunisia (NGBT). The accessions were selected to maximize the diverse geographic and bioclimatic areas present in Tunisia, based on our previous study (Mezghani et al. 2014). We also added two newly collected accessions of D. carota subsp. gummifer from Galite Island, Tunisia. Geographic and bioclimatic data related to the accessions are shown in Table 1. Global Positioning System coordinates were imported into DIVA-GIS version 7.5 (http://www.diva-gis.org/) and served as input data for mapping the accessions (Fig. 1).

Table 1 Taxon name, origin, and bioclimatic zones of the Daucus populations studied here
Fig. 1 Collection sites of the accessions examined in Tunisia

Experimental field trial

The experiment was conducted in 2015 under field conditions at the High Institute of Agronomy of Chott Mariem (35.1182°N; 10.7297°E) using a randomized block design, with 45 accessions, two replications and 5 plants per plot. Two 1-m rows of each accession were seeded directly by hand in the field. Spacing was 3 m between blocks, 1.5 m between rows, and 0.2 m between plants. During cultivation, agronomic practices including irrigation, weeding, and fertilization were applied uniformly as required in all plots.

Characters recorded

Daucus accessions were scored for 10 quantitative (continuous) and 22 qualitative (discontinuous) traits related to leaf, stem and flower (Table 2), using nine randomly selected plants per accession. Characters were selected following the International Plant Genetic Resources Institute descriptors for wild and cultivated carrot (Daucus) (IPGRI 1998), supplemented with characters used by Pottier Alapetite (1979) and Sáenz Laín (1981). Some morphological characters were measured at the sampling location, and others were recorded by microscopic observation in the laboratory. Quantitative traits were measured with a ruler or caliper, while qualitative characters were scored and coded according to the IPGRI (1998) descriptors. Images of different plant parts were taken in the field and in the laboratory with a digital camera and are available in the NGBT database. Herbarium vouchers of the different species and subspecies are deposited at the NGBT herbarium.

Table 2 Morphological descriptors, descriptor states, their codes for numerical analysis, frequency distribution, and diversity index of Daucus accessions listed in Table 1

Data analyses

Twenty-two of the 32 characters were scored and analyzed as discontinuous variables; the remaining ten were treated as continuous (Table 2). The operational taxonomic unit (OTU) was the individual, not the means of accessions as in our prior analyses (Mezghani et al. 2014; Spooner et al. 2014), precluding the need for assessing derivatives of averages or modes, and providing a larger and statistically more reliable data set.

For calculating the diversity parameters only, the overall entry mean value and the standard deviation were used to convert quantitative characters into qualitative ones (Jaradat et al. 2004), and frequencies were obtained from class intervals. Diversity was measured for each morphological character using the standardized Shannon–Weaver diversity index (Shannon and Weaver 1949, as cited by Al Khanjari et al. 2008), designated H’: H’ = −∑ pi (log2 pi)/log2 n, where pi is the frequency proportion of each descriptor state and n is the number of states for each descriptor. H’ was classified as high (H’ ≥ 0.60), intermediate (0.40 ≤ H’ < 0.60) or low (0.10 ≤ H’ < 0.40), as described by Eticha et al. (2005).
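The index calculation described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, and the three-class binning rule (below mean − SD, within one SD of the mean, above mean + SD) is an assumption: the text states only that the mean and standard deviation were used to form classes. The label returned for H’ below 0.10 is likewise an assumption, since the text defines no class for that range.

```python
import math
from collections import Counter
from statistics import mean, stdev


def bin_by_mean_sd(values):
    """Convert a quantitative character into three classes around the mean.

    Assumption: classes are < mean - SD, within one SD, and > mean + SD.
    The source says only that the mean and SD were used.
    """
    m, s = mean(values), stdev(values)
    classes = []
    for v in values:
        if v < m - s:
            classes.append("low")
        elif v > m + s:
            classes.append("high")
        else:
            classes.append("medium")
    return classes


def shannon_weaver(states, n_states=None):
    """Standardized index H' = -sum(p_i * log2(p_i)) / log2(n).

    n defaults to the number of states observed; pass n_states to use
    the number of states defined for the descriptor instead.
    """
    counts = Counter(states)
    n = n_states if n_states is not None else len(counts)
    if n < 2:
        return 0.0  # a monomorphic character carries no diversity
    total = sum(counts.values())
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return h / math.log2(n)


def classify(h):
    """Thresholds quoted in the text (after Eticha et al. 2005)."""
    if h >= 0.60:
        return "high"
    if h >= 0.40:
        return "intermediate"
    if h >= 0.10:
        return "low"
    return "unclassified"  # assumption: the text gives no class below 0.10
```

For example, a two-state character split evenly among individuals gives H’ = 1, whereas a 9:1 split gives H’ of about 0.47, which falls in the intermediate class.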

Multivariate analyses were conducted in JMP 10.0.0 software (SAS Institute 2012). We ran three types of analyses to explore the best ways to distinguish the accessions: (1) principal components analysis (PCA) using all of the data; (2) cluster analysis (average similarity, standardizing the data) using all of the data; (3) canonical discriminant analysis (CDA) and stepwise discriminant analysis (SDA) (linear, common covariance) using the ten continuous variables to obtain a model whose variables were significant in correctly identifying accession composition, with characters removed one at a time (if needed) until the model F-test p value was less than or equal to 0.05. Scatter plots were then constructed to illustrate character-state distributions of the twelve best qualitative or quantitative characters distinguishing the taxa based on principal components and discriminant analyses.
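As an illustration of the cluster analysis step, the sketch below shows one common form of average-linkage clustering on standardized data. It is a sketch under stated assumptions, not a reproduction of the JMP procedure: it assumes Euclidean distance between z-scored characters and average linkage (UPGMA), the function names are hypothetical, and the measurements in the usage note are invented.

```python
import math
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so characters on different scales are comparable."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def average_linkage(rows, k):
    """Merge the two closest clusters until k remain (UPGMA).

    Cluster distance is the mean Euclidean distance over all pairs of
    members (an assumed metric). Returns clusters as sorted lists of
    row indices.
    """
    z = standardize(rows)
    clusters = [[i] for i in range(len(z))]

    def dist(a, b):
        return mean(math.dist(z[i], z[j]) for i in a for j in b)

    while len(clusters) > k:
        pairs = [(dist(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        _, a, b = min(pairs)
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]  # b > a, so index a is unaffected
    return sorted(clusters)
```

With four hypothetical individuals measured for two characters, `average_linkage([[10.0, 1.0], [11.0, 1.2], [50.0, 5.0], [52.0, 5.1]], 2)` groups the first two rows together and the last two together. Because each row is an individual rather than an accession mean, individuals from one accession are free to fall into different clusters.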

Results

Diversity analysis

Large natural variation was found among accessions for the majority of traits (Table 2). Accessions were homogeneous for only four characters: umbel type (UT), position of involucral bracts on primary umbel (PIB), anther color (AC) and symmetry of peripheral flowers (SPF). All accessions have compound umbels with deflexed involucral bracts, yellow anthers and symmetrical peripheral flowers. Estimated diversity (H’) for polymorphic traits ranged from 0.26 for foliage coverage (FoC) to 0.98 for flowering pattern within plants (FP), with overall means of 0.75, 0.77 and 0.76 for qualitative, quantitative and grand diversity mean, respectively. High phenotypic variability (H’ ≥ 0.6) was observed for all quantitative characters and 16 qualitative characters. Intermediate variation (0.4 ≤ H’ < 0.6) was observed for petiole shape in transverse section (PSTS) and stem ridging (SR). Low variation indicated the dominance of one character state over the others, while high variation indicated equitable distribution of the different states, as shown by the frequency distributions.

Phenetic analyses

Both the PCA (Fig. 2), using all ten quantitative and 18 polymorphic qualitative characters, and CDA (Fig. 3), using the ten quantitative characters, defined four phenetic clusters: (1) D. sahariensis, (2) D. carota subsp. capillifolius, (3) D. carota subsp. carota, partially intergrading with D. carota subsp. gummifer, and (4) D. carota subsp. sativus, partially intergrading with putative hybrids between D. carota subsp. capillifolius and D. carota subsp. carota. Although the CDA used only ten of the 28 total characters, the groups were still separated from each other. The 95% confidence ellipses about the means of the groups overlap only in cluster 4, which contains D. carota subsp. sativus and the putative hybrids between D. carota subsp. capillifolius and D. carota subsp. carota. PCA and CDA are both ordination techniques, but PCA makes no assumptions of group membership of OTUs. It attempts to portray multidimensional variation in the data set in the fewest possible dimensions, while maximizing the variation. CDA uses assigned groups to derive a linear combination of the variables (here morphological characters) that produces the greatest separation of the groups. As such, it tends to show slightly less overlap of the two taxa in group 3, D. carota subsp. carota and D. carota subsp. gummifer.
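The idea that PCA finds the direction of greatest variation without reference to group labels can be sketched as follows. This is an illustrative sketch only, not the JMP implementation: it assumes PCA on the correlation matrix (that is, on standardized characters), and it extracts only the first component by power iteration. All names are hypothetical.

```python
import math
from statistics import mean, pstdev


def correlation_matrix(rows):
    """Correlation matrix of the columns; assumes every column varies."""
    cols = list(zip(*rows))
    z = []
    for c in cols:
        m, s = mean(c), pstdev(c)
        z.append([(v - m) / s for v in c])
    n = len(rows)
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


def first_component(rows, iterations=500):
    """Leading eigenvector (loadings) and eigenvalue of the correlation matrix.

    Power iteration: repeatedly multiply a vector by the matrix and
    renormalize; it converges to the direction of greatest variance.
    No group labels are used at any point.
    """
    r = correlation_matrix(rows)
    # Asymmetric start so the vector is generically not orthogonal
    # to the leading eigenvector.
    v = [float(i + 1) for i in range(len(r))]
    for _ in range(iterations):
        w = [sum(rij * vj for rij, vj in zip(row, v)) for row in r]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(vi * sum(rij * vj for rij, vj in zip(row, v))
                     for vi, row in zip(v, r))
    return v, eigenvalue
```

The eigenvalue divided by the number of characters gives the proportion of total standardized variance captured by the first component. CDA differs in that it would use the assigned taxon of each individual to choose the direction that best separates the group means.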

Fig. 2 Principal components analysis of the accessions examined in Tunisia

Fig. 3 Canonical discriminant analysis of the accessions examined here. Ellipses show the 95% confidence ellipse of each mean

The cluster analysis (Fig. 4), however, defined an additional two groups (now six in total), splitting group 3 above into two, D. carota subsp. carota and D. carota subsp. gummifer, and group 4 into two, D. carota subsp. sativus, and the putative hybrids between D. carota subsp. capillifolius and D. carota subsp. carota. While the PCA grouped the taxa separately, it did not group the nine individuals per accession together; that is, the individuals of different accessions were largely intermixed within the larger taxon cluster. Like the PCA, the cluster analysis makes no assumptions of group membership of OTUs. The cluster analysis and CDA results did not differ substantially, however, because in both cases where the extra groups are split, they are barely distinguished in the cluster analysis (Fig. 4).

Fig. 4 Phenogram of the accessions examined in Tunisia

Character state distributions

All ten of the continuously variable characters were significant (p < 0.001) in discriminating the species based on the discriminant analysis, listed from most to least significant as determined by the F ratio statistic: mean stem length (MSL), average number of umbellets per umbel (ANUU), mean stem diameter at the base (MSDB), mean stem diameter at the extremity (MSDE), total number of umbels per plant (TNUP), mature leaf length (MLL), width of primary open umbel (WPOU), petiole thickness (PT), length of primary basal leaflet (LPBL), and mature leaf width (MLW). Based on high eigenvectors (positive or negative), the following five characters were most important in distinguishing the species in principal component 1, listed from highest to lowest: ANUU (above), density of flowers in umbels (DFU), corolla color (CC), type of involucral bracts on primary umbel (TIB), and leaf shape (LS); and in principal component 2: MSL (above), TNUP (above), anthocyanin coloration in petiole (ACP), petiole and leaf hairiness (PLH), and MLL (above). Four of these characters are repeated (ANUU, MLL, MSL, TNUP), making for 16 quantitative or qualitative characters best distinguishing the taxa analyzed here. Based on ranking the F statistic of the 10 characters from the discriminant analysis, and inspection of character state distributions from the 10 characters as determined by the PCA factor loadings, the best 12 of these 16 characters (eight quantitative and four qualitative) are presented in Fig. 5. These distributions reflect the phenetic results of the PCA and CDA. That is, D. sahariensis is most clearly defined, as is reflected in six of the 12 characters shown in Fig. 5 that have no or very little overlap with any of the other taxa: mean stem length, mean stem diameter at base, average number of umbellets per umbel, mature leaf length, mature leaf width, and leaf shape. These easily distinguish this species as a relatively low-growing plant with small stems, few umbellets/umbel, and small pinnatifid leaves.

Fig. 5 Character state distributions of the 12 best quantitative and qualitative characters distinguishing the taxa examined in Tunisia

Daucus carota subsp. capillifolius (ignoring the putative hybrids) is also well distinguished by its corolla color, leaf shape, and type of involucral bract on primary umbel, which readily define this subspecies as having yellow corollas, bipinnate leaves, and comparatively broad ultimate segments of the involucral bracts on the primary umbel.

Distinguishing D. carota subsp. carota and D. carota subsp. gummifer (as a group) from the other taxa, however, is more difficult and relies on a series of partially overlapping character states (polythetic support). These include mean stem length, corolla color, type of involucral bract on primary umbel, and leaf shape. Distinguishing D. carota subsp. carota and D. carota subsp. gummifer from each other is more difficult still. Again, overlapping characters are used, including shorter stems and more umbellets/umbel. Daucus carota subsp. gummifer (sensu lato) is a highly polymorphic taxon distributed very near the western Mediterranean coasts and the Atlantic coasts of the UK, France, Spain and Portugal, and is also distinguished by characters difficult to measure here, such as stiff parts in the peduncles and inflorescences, and often thickened and shiny (varnished) leaves.

Finally, distinguishing D. carota subsp. sativus (cultivated carrot) and the putative hybrids between D. carota subsp. capillifolius and D. carota subsp. carota (as a group) from the other taxa also is difficult and relies on a series of partially overlapping character states. These include mean stem diameter at the base and relatively branched bracts on the primary umbel.

Discussion and conclusions

Information on diversity and relationships within and among crop species and their wild relatives is essential for the efficient utilization of plant genetic resource collections and presents an efficient proxy of taxonomic relationships (Sun and Wong 2001; Drzewiecki et al. 2003). In breeding programs, characterization of accessions based on multiple traits can be used as a management tool during regeneration to validate the identity of an accession. Evaluation data are used when searching the gene bank for useful germplasm (DeLacy et al. 2000). International surveys clearly show the need to save and manage the local germplasm of each country, since accessions may contain genes for biotic and abiotic stress tolerance that are valuable for crop breeding (Mengistu et al. 2015).

For Daucus, the taxonomy of the members of the Daucus carota clade remains unresolved (Arbizu et al. 2014b, 2016b). The morphological characters used in our study revealed considerable diversity within a local (Tunisian) Daucus collection maintained at the National Gene Bank of Tunisia, as shown by mean diversity indexes of 0.75, 0.77 and 0.76 for qualitative, quantitative and total diversity, respectively, showing Tunisia to contain significant diversity for Daucus in the southern Mediterranean region. Except for D. sahariensis and D. carota subsp. capillifolius, there were no taxon-specific quantitative or qualitative characters; rather, taxonomic differentiation relies on a series of partially overlapping character states (polythetic support). This is reflected in their very designation as subspecies rather than species, in the continuing disagreement over taxonomic boundaries, and in the different characters taxonomists use to differentiate them. These taxonomic problems are possibly the result of their ability to freely exchange genes in nature and experimentally (above). Our study, using a well-dispersed and comprehensive germplasm collection from a local area containing much of this diversity (Tunisia), adds to a growing body of evidence documenting the poor differentiation of these taxa.

Two Tunisian accessions we examined (10944 and 10951) possess characters of yet another possible subspecies, D. carota subsp. maximus (Desf.) Ball: inflated, leathery stipule bases, relatively large umbels, and fruits with spines shorter than the diameter of the fruit and stellate at the top (Mezghani et al. 2014). Using morphological data, Spooner et al. (2014) suggested that only two subspecies of D. carota can be distinguished: subsp. carota and subsp. gummifer, corresponding to the two species (D. carota and D. gingidium) recognized by Onno (1937) and Pottier Alapetite (1979) or to the two “species aggregates” or “subgroups” recognized by Small (1978) and Reduron (2007). On the basis of molecular and morphological data, Arbizu et al. (2014a, b) later lowered D. capillifolius to subspecies rank of D. carota. Tavares et al. (2014) attempted to distinguish the subspecies of D. carota native to Portugal and supported subsp. maximus as a distinct taxon by morphometric analysis of the fruits and chemical characterization of essential oils, concluding, as did Saenz de Rivas and Heywood (1974), that subsp. maximus should be considered a species (i.e., D. maximus) rather than a subspecies of D. carota. A broader molecular (genotyping by sequencing) study by Arbizu et al. (2016b) did not support the subspecies of D. carota as monophyletic lineages, except subsp. capillifolius which is an apospecies occurring within the clade containing subsp. carota; that is, subsp. maximus and subsp. gummifer occurring in different clades containing geographically isolated populations of subsp. carota in Europe (France, Portugal and Italy) and subsp. maximus in Northern Africa (Morocco).

PCA and CDA results unexpectedly placed the putative hybrids between D. carota subsp. capillifolius and D. carota subsp. carota with D. carota subsp. sativus. We have no explanation for this other than knowledge that hybrids often exhibit a mosaic of intermediate, parental, transgressive, and novel characters (Rieseberg and Ellstrand 1993), and the morphological results shown here do not reflect true hybrid origins. Alternatively, our hypothesis of hybridity may be incorrect and perhaps these are true hybrids with subsp. sativus.

The ultimate resolution of the taxonomic boundaries of members of the D. carota clade will require analysis of a broader range of accessions with additional molecular data, as we are pursuing collaboratively with a broader research group.