
2.1 Taxonomy of the Apiaceae (Umbelliferae)

The Apiaceae (Umbelliferae) family contains 466 genera and 3820 species (Plunkett et al. in press) and is one of the largest families of seed plants. It is nearly cosmopolitan in distribution, but most diverse in temperate regions of the northern hemisphere (Downie et al. 2000a, b, c; Heywood 1983). It is well supported as a monophyletic family, closely related to the families Araliaceae, Pittosporaceae, and Myodocarpaceae, and these, along with three smaller families, constitute the order Apiales, containing about 5400 species (Judd et al. 2016; Plunkett et al. 1996b).

The Apiaceae is well defined morphologically by a suite of characters, typically including a herbaceous habit; stems usually hollow in the internodes and with secretory canals containing ethereal oils, resins, and other compounds; alternate compound leaves, or simple and deeply divided or lobed leaves, with sheathing petioles; determinate inflorescences of simple to compound umbels, often subtended by involucral bracts; small flowers with 5 sepals, 5 petals, 5 stamens, and 2 connate carpels with an inferior ovary; 2 small stigmas; and a schizocarp fruit (a dry fruit breaking into one-seeded segments), with each of the two mericarps attached to a central stalk (carpophore) that is entire to deeply forked (Judd et al. 2016).

This large suite of distinctive characters makes the Apiaceae and its constituent species easily recognized to family, but divisions within the family have long been disputed, including the circumscription and relationships of the genus Daucus (Constance 1971; Plunkett and Downie 1999). Traditionally, the Apiaceae has been divided into three subfamilies, the Saniculoideae, Hydrocotyloideae, and Apioideae, with the Apioideae, containing the genus Daucus, by far the largest of the three. Drude (1898) recognized 8 tribes and 10 subtribes within the Apioideae. Molecular phylogenetic studies have confirmed the monophyly of the subfamily Apioideae but not of many of its tribes and subtribes (Downie et al. 2001). Downie et al. (2001) recognized nine tribes in the Apiaceae subfamily Apioideae, and placed Daucus, and 12 other genera, in tribe Scandiceae Spreng., subtribe Daucinae Dumort. (the other genera being Agrocharis Hochst., Ammodaucus Coss. and Durieu, Cuminum L., Laser Borkh. ex P. Gaertn., B. Mey. and Schreb., Laserpitium L., Melanoselinum Hoffm., Monizia Lowe, Orlaya Hoffm., Pachyctenium Maire and Pamp., Polemannia Eckl. and Zeyh., Polylophium Boiss., Pseudorlaya (Murb.) Murb., and Thapsia L.).

A genus-level treatment of Daucus by Sáenz Laín (1981) used morphological and anatomical data and recognized 20 species. Rubatzky et al. (1999) later estimated 25 species of Daucus. The phylogenetic relationships among the species of genus Daucus and close relatives in the Apioideae have been clarified by a series of molecular studies using DNA sequences of the plastid genes rbcL and matK; plastid introns rpl16, rps16, rpoC1; nuclear ribosomal DNA internal transcribed spacer (ITS) sequences; and plastid DNA restriction sites (e.g., Arbizu et al. 2014b, 2016a, b; Banasiak et al. 2016; Downie and Katz-Downie 1996; Downie et al. 1996, 1998, 2000a, b, c, 2001, 2010; Katz-Downie et al. 1999; Lee 2002; Lee and Downie 1999, 2000, 2006; Plunkett et al. 1996a; Spalik and Downie 2007; Spalik et al. 2001a, b; Weitzel et al. 2014). Of these DNA markers, the ITS region, consisting of ITS1, the intervening 5.8S ribosomal RNA gene, and ITS2, has served as the main marker. A recent study of ITS and other DNA regions proposed as standard barcodes (psbA-trnH, matK, and rbcL) in 1957 species in 385 diverse genera in the Apiaceae showed that ITS identified species 73.3% of the time, a higher rate than any of the other individual markers tested (Liu et al. 2014).

A study by Banasiak et al. (2016) using DNA sequences from nuclear ribosomal ITS and three plastid markers (rps16 intron, rpoC1 intron, and rpoB-trnC intergenic spacer) is the latest of a series of studies to investigate ingroup and outgroup relationships of Daucus (Fig. 2.1). This study redefined and expanded the genus Daucus to include the following genera and species in its synonymy: Agrocharis Hochst. (4 species), Melanoselinum Hoffm. (1 species), Monizia Lowe (1 species), Pachyctenium Maire and Pamp. (1 species), Pseudorlaya (Murb.) Murb. (2 species), Rouya Coincy (1 species), Tornabenea Parl. (6 species), Athamanta dellacellae E. A. Durand and Barratte, and Cryptotaenia elegans Webb ex Bolle (these latter two genera with only some of their members transferred to Daucus).

Fig. 2.1 Reproduction of the upper part of the Daucus maximum likelihood phylogeny of Banasiak et al. (2016), using combined nuclear internal transcribed spacer region of ribosomal DNA (ITS) and plastid (rps16 intron, rpoC1 intron, and rpoB-trnC intergenic spacer) data, with numbers above the branches representing bootstrap support and posterior probability values. The arrows show hard incongruence between Banasiak et al. (2016) and the nuclear ortholog phylogenies of Arbizu et al. (2014b, 2016b)

Banasiak et al. (2016) made the relevant nomenclatural transfers into Daucus (Table 2.1), and following this classification, the genus Daucus contains ca. 40 species and now includes winged and completely unadorned (“obsolete”) fruits in addition to its traditionally recognized spiny fruits. As summarized in Banasiak et al. (2016) and presented in graphic form in Fig. 5 of that paper, winged versus spiny versus obsolete fruits represented major traditional taxonomic characters at higher levels in the Apiaceae (e.g., Drude 1897–1898). Winged fruits are considered to be adapted to wind dispersal (Jongejans and Telenius 2001; Theobald 1971), and spiny fruits to animal dispersal (Jury 1982; Spalik et al. 2001a; Williams 1994), and both are likely under strong selective pressure. The above phylogenetic analyses, however, show these fruit characters to be highly homoplastic and of limited value in delimiting monophyletic groups.

Table 2.1 Taxonomic circumscription of Daucus following Arbizu et al. (2014b, 2016b) and Banasiak et al. (2016), their cladistic relationships, and diploid chromosome numbers

The classification philosophy followed by Banasiak et al. (2016), placing all members of a monophyletic clade into a single genus (here Daucus), is not universally accepted, and others may revise the circumscription of these genera. For example, Stuessy and Hörandl (2014) dissent from the philosophy of relying solely on molecular data for classification; they recognize a “holophyletic” group as one that includes the immediate ancestor and all its descendants, independent of whatever divergence occurs within each of the derivative lineages (Ashlock 1971). A paraphyletic group, in contrast, is one that derives from a common ancestor but that does not contain all its descendants (Hennig 1966) and is an unacceptable taxon following cladistic conventions. Stuessy and Hörandl (2014) point out that adaptive radiation, common in oceanic islands, produces patterns in which new populations continue to accrue reproductive isolation and to speciate, producing quite distinctive new forms, often recognized as new genera, while leaving parental populations intact. As examples in the Daucinae, Stuessy et al. (2014) cite the genus Monizia in the Madeira Islands, but other possibilities could be the genus Tornabenea or the species Cryptotaenia elegans on the Cape Verde Islands or the genus Melanoselinum on the Madeira Islands. Critical data bearing on this classification question rest in the distinctiveness and divergence of these new island forms. Because we have not studied these subsumed genera in detail, we currently take no position on these differences in classification, awaiting additional data and perspectives from others, such as Martínez-Flores (2016) and Plunkett et al. (in press), who maintain more traditional classifications of Daucus.
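The distinction between a monophyletic (holophyletic) group and a group that omits some descendants can be illustrated with a small sketch. The tree below is a toy example with placeholder leaf names, not the topology of any published Daucus phylogeny, and the function names are our own. A group is treated as monophyletic when it matches the leaf set of some clade; otherwise the sketch reports which descendants of the smallest enclosing clade were left out.

```python
# Toy rooted tree as nested tuples; leaf names are hypothetical placeholders,
# not the topology of any published Daucus phylogeny.
TREE = ((("A", "B"), "C"), ("D", "E"))


def leaves(node):
    """Return the set of leaf names below a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def clades(node):
    """Yield the leaf set of every internal node (every clade) in the tree."""
    if isinstance(node, str):
        return
    yield frozenset(leaves(node))
    for child in node:
        yield from clades(child)


def classify(tree, group):
    """Return ('monophyletic', empty set) if the group equals some clade.

    Otherwise return ('not monophyletic', excluded), where excluded holds the
    members of the smallest enclosing clade that the group leaves out.
    """
    group = frozenset(group)
    all_clades = list(clades(tree))
    if len(group) == 1 or group in all_clades:
        return "monophyletic", frozenset()
    smallest = min((c for c in all_clades if group <= c), key=len)
    return "not monophyletic", smallest - group


status_ab = classify(TREE, {"A", "B"})
status_ac = classify(TREE, {"A", "C"})
```

In this toy case the group {A, C} becomes monophyletic only when B is added to it, which mirrors the logic of sinking segregate genera into an expanded genus. Leaf sets alone do not separate paraphyly from polyphyly, so the sketch reports only "not monophyletic".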

2.2 Distribution of Daucus

Phylogenetic analysis of ITS sequences supports southern Africa as the ancestral origin of the Apiaceae subfamily Apioideae (Banasiak et al. 2013). Phylogenetic analysis of ITS sequences also supports an Old World Northern Hemisphere origin for Daucus, with one or two dispersals to the Southern Hemisphere (Spalik et al. 2010). The center of diversity of Daucus in its traditional sense is in the Mediterranean region (Sáenz Laín 1981). Daucus species also occur elsewhere, with one species (D. glochidiatus) in Australia and four species on the American continents (D. carota, D. montanus, D. montevidensis, D. pusillus Michx.). Following the expanded classification of Daucus by Banasiak et al. (2016), the now included genus Agrocharis extends the range of Daucus into tropical Africa (Townsend 1989).

2.3 New Taxonomic Approaches: Next-Generation Sequencing (NGS)

A major innovation in plant systematics is the development of high-throughput, “next-generation” DNA sequencing (NGS) to infer phylogenetic relationships (Egan et al. 2012; E. M. Lemmon and A. R. Lemmon 2013). NGS typically first involves large-scale sequencing of all components of the genome, with the Illumina platform currently the most commonly used. Some genomes, such as the plastid and mitochondrial genomes, have much higher coverage than single- to low-copy nuclear DNA and can be separated from the nuclear genome in NGS data by coverage statistics. The utility of NGS is markedly improved when a high-quality whole-genome “reference” sequence is available that serves as a heterologous template to guide mapping of sequences of related germplasm. Such whole-genome reference sequences are available in carrot for the plastid genome (Ruhlman et al. 2006) and for the plastid and nuclear genome (Iorizzo et al. 2016). As summarized below, recent phylogenetic studies in Daucus have used high-throughput DNA sequencing to infer phylogenetic relationships at the genus level using orthologous nuclear DNA sequences, also at the genus level using whole plastid DNA sequences, and at the species level using genotyping-by-sequencing (GBS).
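As a rough illustration of separating high-copy organellar sequence from nuclear sequence by coverage, the sketch below flags contigs whose mean read depth greatly exceeds the median depth. The contig names, depth values, and the tenfold cutoff are all hypothetical and chosen only for illustration; they are not taken from any of the studies cited here.

```python
from statistics import median


def split_by_coverage(contig_depth, fold=10.0):
    """Separate contigs into putative nuclear and high-copy sets.

    contig_depth: dict mapping contig name -> mean read depth.
    fold: hypothetical cutoff; contigs deeper than fold * median depth
          are flagged as likely plastid or mitochondrial.
    """
    baseline = median(contig_depth.values())
    high_copy = {
        name for name, depth in contig_depth.items() if depth > fold * baseline
    }
    nuclear = set(contig_depth) - high_copy
    return nuclear, high_copy


# Invented depths: three low-copy contigs and two much deeper ones.
depths = {"ctg1": 28.0, "ctg2": 31.0, "ctg3": 30.0, "ctg4": 2900.0, "ctg5": 450.0}
nuclear, high_copy = split_by_coverage(depths)
```

A depth cutoff alone cannot tell organellar DNA from high-copy nuclear repeats such as ribosomal DNA, so in practice flagged contigs would still need to be checked against a reference sequence.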

2.3.1 Next-Generation DNA Phylogenetic Studies at the Genus Level Using Orthologous Nuclear DNA Sequences

In the past, there has been a paucity of validated nuclear orthologs for phylogenetic studies, and hence, most molecular taxonomic studies have relied heavily on a few plastid and/or ribosomal genes (Small et al. 2004). Phylogenies reconstructed with only one or a few independently inherited loci may be unresolved or incongruent because of data sampling (Graybeal 1998), horizontal gene transfer, or differential selection and lineage sorting at individual loci (Maddison 1995). Following a phylogenetic study by Spooner et al. (2013) in which eight nuclear orthologs were used in Daucus but designed without NGS techniques, Arbizu et al. (2014b) identified 94 nuclear orthologs in Daucus, constructed a phylogeny with these, and determined that 10 of them provided essentially the same phylogeny as all 94, paving the way for additional and more cost-effective nuclear ortholog phylogenetic studies in carrot. The 94 (and 10) nuclear ortholog phylogeny was highly resolved, with 100% bootstrap support for most of the external and many of the internal clades. It resolved multiple accessions of many different species as monophyletic with strong support, but failed to support other species. This phylogeny had many points of agreement with Banasiak et al. (2016), including resolving two major clades (Daucus I and II in their study, labeled clade A and B in Arbizu et al. 2014b), with a clade A’ containing all examined 2n = 18 chromosome species (D. carota all subspecies, D. capillifolius, D. syrticus), and with the other clade A species being D. aureus and D. muricatus (as sister taxa), and D. tenuisectus. Two non-Daucus species (Rouya polygama and Pseudorlaya pumila) resolved sister to Daucus clade A’. Clade B (Daucus II in Banasiak et al. 2016) contained six wild Daucus species: D. glochidiatus, D. guttatus, D. involucratus, D. littoralis, and D. pusillus, but D. guttatus was not monophyletic within this clade.
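One simple way to ask whether a tree built from a subset of loci recovers "essentially the same phylogeny" as a tree built from all loci is to count the clades found in only one of the two trees (a rooted Robinson-Foulds distance). The sketch below is a minimal illustration with invented toy trees. We do not claim that this is the comparison method used by Arbizu et al. (2014b).

```python
def clade_sets(tree):
    """Collect the leaf set under every internal node of a nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        found.add(below)
        return below

    walk(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: clades present in exactly one tree."""
    return len(clade_sets(tree_a) ^ clade_sets(tree_b))


# Invented toy trees with placeholder taxon names.
full_tree = ((("A", "B"), "C"), ("D", "E"))
subset_tree_same = (("D", "E"), ("C", ("B", "A")))   # same clades, drawn differently
subset_tree_diff = ((("A", "C"), "B"), ("D", "E"))   # A groups with C instead of B

same = rf_distance(full_tree, subset_tree_same)
diff = rf_distance(full_tree, subset_tree_diff)
```

A distance of zero means the two trees contain exactly the same clades. Larger values count the clades that would have to change for the topologies to agree.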

2.3.2 An Expansion of the Above Study: The Daucus guttatus Complex

As mentioned above, the nuclear ortholog study of Arbizu et al. (2014b) resolved a monophyletic group (clade B) of six wild Daucus species: D. glochidiatus, D. guttatus, D. involucratus, D. littoralis, and D. pusillus. Some of these species are morphologically similar and difficult to distinguish, causing frequent misidentifications. Arbizu et al. (2016b) used the ten nuclear orthologs identified by Arbizu et al. (2014b), morphological data (Arbizu et al. 2014a), and a greatly expanded set of accessions of these species to refine the phylogenetic structure of the group. The nuclear ortholog data resolved four well-supported clades (Fig. 2.2) that, in concert with morphological data and nomenclatural data from a study of type specimens (Martínez-Flores et al. 2016), served to identify the four phenetically most similar species: D. bicolor, D. conchitae, D. guttatus, and D. setulosus. Internested among these four similar species were the phenetically more distinctive species D. glochidiatus, D. involucratus, D. littoralis, and D. pusillus. The authors presented a key to better distinguish all eight of these species. In summary, their research clarified species variation in the D. guttatus complex, resolved interspecific relationships, provided the proper names for the species, and discovered morphological characters allowing proper identification and key construction for members of the D. guttatus complex and related species.

Fig. 2.2 Maximum parsimony phylogenetic reconstruction of the Daucus guttatus complex using 10 nuclear orthologs showing resolution of the species in the Daucus guttatus complex. Numbers above branches represent bootstrap values. Clades 1, 2, and 3 were identified in Arbizu et al. (2014b)

2.3.3 Next-Generation DNA Phylogenetic Studies at the Genus Level Using Whole Plastid DNA Sequences

The plastid genome has many features that make it useful for plant phylogenetic studies, including its small size (generally 120–160 kbp), high copy number (as many as 1000 per cell), generally conservative nature (Wolfe et al. 1987), and varying rates of change in different regions of the genome, allowing studies at different phylogenetic levels (Raubeson and Jansen 2005). Hence, earlier sequence-based plant phylogenetic studies used genes or gene regions from the plastid. Within the Apioideae, the subfamily of the Apiaceae including Daucus, systematic studies have used plastid restriction site data and DNA sequence data from plastid genes, plastid introns, and plastid intergenic spacer regions. Using NGS approaches, Downie and Jansen (2015) sequenced five complete plastid genomes in the Apiales (Apiaceae + Araliaceae): Anthriscus cerefolium (L.) Hoffm., Crithmum maritimum L., Hydrocotyle verticillata Thunb., Petroselinum crispum (Mill.) Fuss, and Tiedemannia filiformis (Walter) Feist and S. R. Downie subsp. greenmanii (Mathias and Constance) Feist and S. R. Downie, and compared the results obtained to previously published plastomes of Daucus carota subsp. sativus and Panax schin-seng T. Nees. They found the rpl32-trnL, trnE-trnT, ndhF-rpl32, 5’rps16-trnQ, and trnT-psbD intergenic spacers to be among the fastest-evolving loci, with the trnD-trnY-trnE-trnT combined region presenting the greatest number of potentially informative characters overall, a region that may provide ideal phylogenetic markers in these families.

Spooner et al. (2017) explored the phylogenetic utility of entire plastid DNA sequences in Daucus, using Illumina sequencing, and compared the results with prior phylogenetic results using plastid and nuclear DNA sequences. The phylogenetic tree of the entire data set (Fig. 2.3) was highly resolved, with 100% bootstrap support for most of the external and many of the internal clades. Subsets of the plastid data, such as matK, ndhF, or the putative maximally informative regions of the plastid genome outlined by Downie and Jansen (2015) are only partly successful in Daucus, resulting in polytomies and reduced levels of bootstrap support. Additionally, there are areas of hard incongruence (strongly supported character conflict because of differences in underlying evolutionary histories) with phylogenies using nuclear data (Fig. 2.1).

Fig. 2.3 Maximum likelihood cladogram of the entire plastid DNA sequences of Spooner et al. (2017), with the three main clades indicated, with arrows highlighting hard topological incongruence with the nuclear ortholog phylogenies of Arbizu et al. (2014b, 2016b); the two accessions of Daucus syrticus resolve as a sister group to all accessions of D. carota. a Represents expanded topological detail of the upper portion of the entire tree shown on b. The values above the branches are bootstrap support values

Incongruence between plastid and nuclear genes is not uncommon in phylogenetic studies in the Apiaceae (e.g., Lee and Downie 2006; Yi et al. 2015; Zhou et al. 2009), and indeed throughout many angiosperms (Wendel and Doyle 1998). These incongruent results showed the value of resequencing data in producing a well-resolved plastid phylogeny of Daucus, and highlighted the need for caution in combining plastid and nuclear data, if they are combined at all. The value of generating phylogenies from both nuclear and plastid sequences is that hard incongruence can be quite informative, suggesting such evolutionary processes as “plastid capture,” where incongruence can be caused by a history of hybridization between plants with differing plastid and nuclear genomes (Rieseberg and Soltis 1991), followed by backcrossing to the paternal parent while retaining the plastid genome, which is (typically) maternally inherited. Other possible processes that can lead to such incongruence, however, are gene duplication (Page and Charleston 1997), horizontal gene transfer (Doolittle 1999), and incomplete lineage sorting (Pamilo and Nei 1988).
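Hard incongruence, as defined above, can be sketched as a search for pairs of clades, one from each tree, that cannot both be true and that are each strongly supported. In the minimal example below the taxon names, bootstrap values, and the 70% support threshold are hypothetical and chosen only for illustration. Two clades are treated as incompatible when they share taxa but neither is nested within the other.

```python
def hard_conflicts(support_a, support_b, threshold=70):
    """Return pairs of clades, one per tree, that are mutually incompatible
    and both supported at or above the threshold.

    support_a, support_b: dicts mapping a clade (frozenset of taxon names)
    to its bootstrap support in each tree.
    """
    conflicts = []
    for clade_a, boot_a in support_a.items():
        for clade_b, boot_b in support_b.items():
            overlapping = clade_a & clade_b
            nested = clade_a <= clade_b or clade_b <= clade_a
            if overlapping and not nested and min(boot_a, boot_b) >= threshold:
                conflicts.append((clade_a, clade_b))
    return conflicts


# Invented clades and bootstrap values for two hypothetical trees.
nuclear_tree = {
    frozenset({"sp1", "sp2"}): 100,
    frozenset({"sp1", "sp2", "sp3"}): 95,
}
plastid_tree = {
    frozenset({"sp2", "sp3"}): 98,
    frozenset({"sp1", "sp2", "sp3"}): 60,
}

conflicts = hard_conflicts(nuclear_tree, plastid_tree)
```

Here the nuclear clade (sp1, sp2) and the plastid clade (sp2, sp3) conflict and both are well supported, so the pair is reported. If either clade had weak support, the conflict would count as soft incongruence and would not be listed.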

2.3.4 Next-Generation DNA Phylogenetic Studies at the Species Level: Genotyping-by-Sequencing (GBS) for the Daucus carota Complex

The genus Daucus contains cultivated carrot (Daucus carota L. subsp. sativus Hoffm.), the most important member of the Apiaceae in terms of economic value and nutrition (Rubatzky et al. 1999; Simon 2000), which is considered the second most popular vegetable worldwide after potato (Heywood 2014). Daucus carota has many formally named subspecies and varieties, and the species is widely naturalized in many countries worldwide. The great morphological variation in D. carota has resulted in more than 60 infraspecific taxa, making D. carota the most problematic species group in the Apiaceae (Heywood 1968a, b; Small 1978; Thellung 1926). Cultivated carrots and closely related wild carrots (other subspecies and varieties of D. carota sensu lato) belong to the Daucus carota complex. Its constituent taxa all possess 2n = 18 chromosomes and have weak biological barriers to interbreeding. Hybridization occurs widely, both experimentally and spontaneously, between commercial varieties of carrot and the wild subspecies of D. carota (e.g., Ellis et al. 1993; Hauser 2002; Hauser and Bjørn 2001; Krickl 1961; McCollum 1975, 1977; Nothnagel et al. 2000; Rong et al. 2010; Sáenz de Rivas and Heywood 1974; Steinborn et al. 1995; St. Pierre and Bayer 1991; St. Pierre et al. 1990; Umiel et al. 1975; Vivek and Simon 1999; Wijnheijmer et al. 1989). In addition, there are other closely related wild species with 2n = 18 chromosomes (D. sahariensis, D. syrticus), as shown by shared karyotypes (Iovene et al. 2008) and by the genus-level phylogenetic studies summarized above, and they represent gene pool 1 species to cultivated carrot. The haploid chromosome number for the genus Daucus (sensu stricto) ranges from n = 8 to n = 11. Diploid chromosome numbers in Daucus accordingly range from 2n = 16 to 2n = 22, and in addition a tetraploid (D. glochidiatus) and a hexaploid (D. montanus) species have been reported (Table 2.1).

To put the taxonomic problem of the Daucus carota complex into historical context, we note that several molecular approaches have examined its diversity and genetic relationships. St. Pierre et al. (1990) used isozymes to study 168 accessions of the D. carota complex from 32 countries and could not separate named subspecies into distinct groups. Nakajima et al. (1998) used random amplified polymorphic DNA (RAPD) and amplified fragment length polymorphism (AFLP) data and showed that all accessions of D. carota grouped into a major clade. Vivek and Simon (1998, 1999) used restriction fragment length polymorphisms (RFLPs) of nuclear, plastid, and mitochondrial DNA and interpreted their results to be generally concordant with the classification proposed by Sáenz Laín (1981), but studied just one additional subspecies (subsp. drepanensis). Using AFLPs, Shim and Jørgensen (2000) showed that wild and cultivated carrot clustered separately. Bradeen et al. (2002) used AFLPs and intersimple sequence repeats (ISSR) and concluded that wild carrots had no substructure. Rong et al. (2014) obtained a Daucus phylogeny using SNPs and found the subspecies of D. carota to be intermixed with each other. Lee and Park (2014) proposed D. sahariensis, D. syrticus, and D. gracilis to be the likely closest relatives of D. carota. In an attempt to characterize the populations of D. carota present on São Miguel Island (Azores, Portugal), Matias Vaz (2014) used one nuclear ortholog, nuclear ribosomal DNA ITS, and morphological descriptors and concluded that the classification of D. carota remained problematic. Other morphological studies (Arbizu et al. 2014a; Mezghani et al. 2014; Small 1978; Spooner et al. 2014; Tavares et al. 2014) likewise did not distinguish the subspecies of D. carota. However, Iorizzo et al. (2013) used 3326 single nucleotide polymorphisms (SNPs) to study the genetic structure and domestication of carrot and found a clear separation between wild (subsp. carota) and cultivated (subsp. sativus) accessions of D. carota.

These taxonomic problems have practical considerations for germplasm curators and taxonomists who have relied on local floras for identifying these taxa such as floras from Algeria (Quézel and Santa 1963), the Azores (Schäfer 2005), Europe (Heywood 1968b), the Iberian Peninsula and Balearic Islands (Pujadas Salvà 2003), Libya (Jafri and El-Gadi 1985), Morocco (Jury 2002), Palestine (Zohary 1972), Portugal (Franco 1971), Syria (Mouterde 1966), Tunisia (Le Floc’h et al. 2010; Pottier-Alapetite 1979), and Turkey and the East Aegean Islands (Cullen 1972). Unfortunately, the keys and descriptions in these floras lack consensus about both the number of infraspecific taxa and characters best distinguishing them. For instance, 11 wild subspecies were recognized by Heywood (1968a, b), five by Sáenz Laín (1981: subsp. carota, subsp. gummifer, subsp. hispanicus, subsp. maritimus, and subsp. maximus), five by Arenas and García-Martin (1993), and Pujadas Salvà (2002) proposed nine subspecies for the Iberian Peninsula plus Balearic Islands (subsp. carota, subsp. cantabricus, subsp. commutatus, subsp. gummifer, subsp. halophilus, subsp. hispanicus, subsp. majoricus, subsp. maximus, and subsp. sativus).

Molecular investigations are trying to resolve the natural taxa in D. carota. “Reduced-representation” methods obtain partial DNA polymorphisms throughout the genome and have been shown to be very useful at the species level. Genotyping-by-sequencing (GBS) is one such reduced-representation method that generates sequence variants or single nucleotide polymorphisms (SNPs) (Elshire et al. 2011). GBS provides a powerful and cost-effective molecular approach for phylogeny reconstruction, producing abundant large-scale genomic data to infer phylogenetic relationships among recently diverged species or populations (e.g., Balfourier et al. 2007; Escudero et al. 2014; Good 2011; Wong et al. 2015). It captures both neutral genetic diversity and loci that affect quantitative traits of interest, because of the full-genome coverage of the GBS markers. It shows little to no ascertainment bias because markers are developed directly on the population being genotyped. Genetic relatedness among genotypes calculated using GBS markers is based on patterns of neutral and functional genetic variation across the genome.

Arbizu et al. (2016a) used GBS to examine the subspecies of D. carota. They obtained SNPs covering all nine D. carota chromosomes from 162 accessions of Daucus and related genera. They scored a total of 10,814 or 38,920 SNPs with a maximum of 10 or 30% missing data, respectively. Consistent with prior results, the phylogenetic tree placed the species with 2n = 18 chromosomes in a single clade, separate from all other species. Most interestingly, there was a strong geographic component to this phylogeny, with the wild members of D. carota from central Asia in a clade with eastern members of subsp. sativus. The other subspecies of D. carota were in four clades associated with geographic groups, suggesting that the subspecies are not natural groups. In summary, the wide range of morphological and molecular studies summarized above documents a lack of morphologically or phylogenetically stable groups within D. carota. These results were concordant with results from recent morphological studies that led Spooner et al. (2014) to question whether many wild subspecies recognized within D. carota are valid taxa.
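The effect of a missing-data threshold on the number of SNPs retained, as with the 10% and 30% cutoffs mentioned above, can be sketched with a simple filter. The SNP identifiers and genotype calls below are invented for illustration, and the function is our own minimal example rather than the pipeline used by Arbizu et al. (2016a).

```python
def filter_snps(genotypes, max_missing):
    """Keep SNPs whose fraction of missing calls is at most max_missing.

    genotypes: dict mapping SNP id -> list of calls, one per accession,
               with None marking a missing call.
    """
    kept = {}
    for snp, calls in genotypes.items():
        missing = sum(call is None for call in calls) / len(calls)
        if missing <= max_missing:
            kept[snp] = calls
    return kept


# Invented calls for ten accessions; None marks a missing genotype.
genotypes = {
    "snp1": ["A", "A", "G", "A", "G", "A", "A", "G", "A", "A"],           # 0% missing
    "snp2": ["C", None, "C", "T", "C", "C", "T", "C", "C", "C"],          # 10% missing
    "snp3": [None, "T", None, "T", "T", None, "T", "T", "T", "T"],        # 30% missing
    "snp4": [None, None, "G", None, "G", None, "G", None, "G", "G"],      # 50% missing
}

strict = filter_snps(genotypes, max_missing=0.10)
relaxed = filter_snps(genotypes, max_missing=0.30)
```

Relaxing the threshold retains more markers at the cost of more missing genotypes per marker, which is the trade-off reflected in the two SNP totals reported above.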

2.4 Conclusions

In summary, the taxonomy of Daucus at both the genus and species levels has been improved markedly in recent years by a series of morphological and molecular studies. Earlier studies using limited sets of plastid and nuclear markers have shown nuclear ribosomal ITS to be the most useful marker. Next-generation sequencing techniques are corroborating many of these studies while adding details, and in particular they caution against combining nuclear and plastid data in a single analysis. The phylogenetic study of Banasiak et al. (2016) has clarified ingroup and outgroup relationships and has resulted in an expanded concept of the genus. Continuing studies at the species and genus levels with NGS data and with additional collections are helping to refine our understanding of Daucus and should eventually lead to a much needed formal taxonomic revision taking into account phylogeny, keys, descriptions, illustrations, typifications, distributions, and maps.