5.1 Potential of Genome Information in the Wild Relatives of Persimmon

Recent progress in biotechnology, especially in molecular biological techniques, has enabled various genetic and genomic approaches in non-model plant species, including tree crops. In tree crops, long juvenile phases generally prevent rapid selection of bred lines, and marker-assisted selection (MAS) based on genomic information has been proposed as a good solution to this problem since the 1990s (Ribaut and Hoisington 1998; Collard and Mackill 2008). In contrast to mostly diploid lineages, such as tree crops in the genus Citrus or the family Rosaceae, polyploid species have been thought to be hard to study with genetic or genomic approaches, owing to their genetic complexity and the difficulty of sequencing their genomes. Persimmon, as a highly polyploid species (2n = 6X or 9X = 90 or 135), is likewise a challenging target for genetic/genomic analysis. Nevertheless, previous studies have managed to develop genetic markers, mainly for understanding cultivar/variety differentiation (Luo et al. 1995; Badenes et al. 2003; Yonemori et al. 2008; Parfitt et al. 2015) or for selecting specific traits, such as fruit astringency derived from proanthocyanidin accumulation (Kanzaki et al. 2001, 2010). In hexaploid species, however, it is still quite hard to construct a genome-wide sequence database, at least with current sequencing technology (although further next-generation techniques may readily enable this, considering the progress of the past 10 years). For genome-wide analysis in hexaploid persimmon, one feasible approach is to utilize the genomes of closely related diploid species. This concept is not limited to persimmon, but has been applied to other polyploid crop species, such as strawberry (Shulaev et al. 2011) and sweet potato (Wu et al. 2018). Previous phylogenetic analyses suggested that diploid Diospyros lotus and Diospyros oleifera would be candidates for such an alternative whole-genome sequence resource (Yonemori et al. 2008).
Indeed, the use of D. lotus genome information led to the identification of genetic markers linked to the sex determination locus (Akagi et al. 2014) and of the region including the ASTRINGENCY locus (Nishiyama et al. 2018). Although we should be careful about the phenotypic differences between these wild relatives and persimmon (D. kaki), using their genomes as substitutes can serve various objectives in genetic research or practical molecular breeding in hexaploid persimmon.

Genome information from diploid Diospyros species is also important for understanding the evolutionary scenario of persimmon. Recent draft genome sequencing in diverse crops has unveiled an “unexpected” (and also “undesirable”) possibility that gene functions are not always conserved among lineages. In contrast to animal taxa, plants have undergone frequent whole-genome duplication (WGD) events in a lineage-specific manner (Van de Peer et al. 2017), which are thought to have provided opportunities for the appearance of new traits that characterize each species. For instance, in horticultural crops, functional differentiation between paralogs derived from WGD resulted in the establishment of specific ripening characteristics in tomato fruits (The Tomato Genome Consortium 2012), a specific oil composition in olive (Unver et al. 2017), and a specific sex-determination system in kiwifruit (Huang et al. 2013; Akagi et al. 2018, 2019). These findings are also consistent with the theoretical framework in which the gene redundancy provided by duplicate copies allows one copy to be neofunctionalized without loss of the original function (Flagel and Wendel 2009). In other words, reliable draft genome sequences of each crop or its relatives are indispensable for understanding their lineage-specific traits, which are often commercially important.

5.2 Draft Genome Sequences in D. lotus

So far, draft genome sequences have been produced for a few diploid Diospyros species, such as D. oleifera (introduced in Chap. 6; Zhu et al. 2019; Suo et al. 2020) and D. lotus. Here, I introduce the characteristics and evolution of the D. lotus genome, often in relation to the sex determination system (see Chaps. 9 and 10). The current version (as of 31 Aug 2021) of the D. lotus cv. Kunsenshi-male (male, 2A + XY) draft genome (http://persimmon.kazusa.or.jp/), for which the expected haploid genome size is 907 Mb from flow cytometry (Tamura et al. 1998) or 877.7 Mb from k-mer analysis of PacBio long-read data, consists of 3,073 primary contigs totaling 746.1 Mb and 5,901 “secondary” contigs, which are putative allelic contigs of the primary contigs. With segregating F1 populations (N = 314 and 119; Akagi et al. 2013), a total of 5,959 markers derived from GBS/ddRAD sequencing allowed anchoring of ca 61.8% of the scaffolds into 15 pseudomolecules, which is consistent with the basic chromosome number in Diospyros (Akagi et al. 2020). The draft genome (primary contigs) includes 40,532 predicted gene loci, a number comparable to those in other asterid species, such as tomato (N = 34,879) (The Tomato Genome Consortium 2012) and kiwifruit (N = 39,040) (Huang et al. 2013).
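
As a rough illustration of what these figures imply, the following minimal Python sketch computes the fraction of the expected haploid genome represented by the primary contigs. The numbers are those quoted above; the function and variable names are hypothetical and are introduced here only for illustration.

```python
# Figures quoted in the text (all in Mb).
PRIMARY_CONTIGS_MB = 746.1
GENOME_SIZE_FLOW_CYTOMETRY_MB = 907.0
GENOME_SIZE_KMER_MB = 877.7


def assembled_fraction(assembled_mb, expected_mb):
    """Fraction of the expected haploid genome represented in the assembly."""
    if expected_mb <= 0:
        raise ValueError("expected genome size must be positive")
    return assembled_mb / expected_mb


coverage_vs_flow = assembled_fraction(PRIMARY_CONTIGS_MB, GENOME_SIZE_FLOW_CYTOMETRY_MB)
coverage_vs_kmer = assembled_fraction(PRIMARY_CONTIGS_MB, GENOME_SIZE_KMER_MB)

print(f"vs flow cytometry estimate: {coverage_vs_flow:.1%}")  # about 82.3%
print(f"vs k-mer estimate: {coverage_vs_kmer:.1%}")  # about 85.0%
```

Under either estimate of genome size, the primary contigs would represent roughly 82 to 85% of the expected haploid genome.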

Past whole-genome duplications are an important characteristic of a genome and are detectable mainly from two indices: (i) long syntenic relationships among genomic regions within the genome, and (ii) the distribution of silent divergence [or, more strictly, the four-fold degenerate synonymous third-codon transversion rate (4DTv)] between duplicated paralogous gene pairs. Previous characterization of genetic diversity and of the distribution of silent divergence (Ks or dS) values in limited numbers of genes in the genera Actinidia, Diospyros, and Camellia suggested lineage-specific genome-wide duplications in the genera Actinidia and Diospyros (Shi et al. 2010). Consistent with this assumption, in the D. lotus draft genome, the two indices described above suggested at least two genome-wide duplication events, which would correspond to the hexaploidization γ (Hex-γ) (Jaillon et al. 2007) common to the eudicot taxa, and a novel one, named Dd-α (Akagi et al. 2020) (Fig. 5.1). This result was also supported by comparative genomics in the genome drafting of tea (Camellia sinensis), which, like the genus Diospyros, belongs to the order Ericales (Wang et al. 2021). However, the timing of the later lineage-specific genome-wide duplication (Dd-α) remains to be examined further, because the species divergences in the ancestral state of the order Ericales are still ambiguously defined (Akagi et al. 2020; Wang et al. 2021). Importantly, the Dd-α occurred concurrently with previously reported whole-genome duplication events in the asterids (Huang et al. 2013; Iorizzo et al. 2016; Reyes-Chin-Wo et al. 2017), as well as across the angiosperms (Vanneste et al. 2014; Van de Peer et al. 2017), which are concentrated around the K-Pg (Cretaceous-Paleogene) boundary, hypothetically 60–70 million years ago (mya) (Fig. 5.2).
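
To make the 4DTv index concrete, the following is a minimal Python sketch, not the pipeline used in the cited studies. It assumes two coding sequences that are already aligned codon by codon, counts transversions only at third positions of four-fold degenerate codon families in the standard genetic code, and applies a simple multiple-hit correction, -1/2 ln(1 - 2p), which is one common choice and an assumption here. All function names are hypothetical.

```python
import math

# First two bases of the eight codon families that are four-fold degenerate
# at the third position in the standard genetic code.
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}
PURINES = {"A", "G"}


def fourfold_transversions(cds_a, cds_b):
    """Return (sites, transversions, raw 4DTv) for two codon-aligned CDS."""
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be aligned, equal-length, and in frame")
    sites = 0
    transversions = 0
    for i in range(0, len(cds_a), 3):
        codon_a = cds_a[i:i + 3].upper()
        codon_b = cds_b[i:i + 3].upper()
        prefix = codon_a[:2]
        # Compare only codons from the same four-fold degenerate family.
        if prefix != codon_b[:2] or prefix not in FOURFOLD_PREFIXES:
            continue
        base_a, base_b = codon_a[2], codon_b[2]
        if base_a not in "ACGT" or base_b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        # A transversion exchanges a purine for a pyrimidine or vice versa.
        if (base_a in PURINES) != (base_b in PURINES):
            transversions += 1
    raw = transversions / sites if sites else float("nan")
    return sites, transversions, raw


def corrected_4dtv(raw):
    """Simple multiple-hit correction (an assumption): -1/2 * ln(1 - 2p)."""
    if raw >= 0.5:
        return float("inf")
    return -0.5 * math.log(1.0 - 2.0 * raw)


# Toy example: CTA/CTT is a transversion, GTT/GTC a transition,
# GCA/GCA is unchanged, and AAA/AAG is not four-fold degenerate.
sites, tv, raw = fourfold_transversions("CTAGTTGCAAAA", "CTTGTCGCAAAG")
print(sites, tv, round(raw, 3), round(corrected_4dtv(raw), 3))
```

In a genome-wide analysis, a value of this kind would be computed for every paralogous pair, and peaks in the resulting distribution would be interpreted as candidate duplication events.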

Fig. 5.1
Graph of the genome-wide distribution of silent divergence. The X-axis shows the silent divergence (dS), and the Y-axis shows the % of gene pairs.

Distribution of silent divergences in the genome-wide homologous gene pairs (this panel is cited from Akagi et al. 2020 PLoS Genet 16(2): e1008566). The peaks correspond to putative genome-wide (or very large) duplication events in the Vitis (purple), Actinidia (green), Diospyros (orange), and Solanum (red) genomes. Silent divergence serves as a proxy for age. For the Diospyros genome, the most prominent peak, marked with an asterisk, was detected almost concurrently with the Solanum-specific (tomato-specific) hexaploidization, named T-tri, which is thought to have occurred ca 60–70 mya. The Hex-γ is thought to be an event common to the (eu)dicot species (Jaillon et al. 2007), including the four species shown, although more recent paleo-polyploidizations tend to mask the genomic traces of Hex-γ

Fig. 5.2
Diagram of angiosperm lineages, including Actinidia, Camellia, Diospyros, Solanum, Lactuca, Daucus, Vitis, Arabidopsis, Carica, Gossypium, Populus, Glycine, Phoenix, and Asparagus.

Lineage-specific whole-genome duplication events in representative angiosperms. Putative whole-genome duplication events, shown as open circles, and their time scales are taken from previous reports (Van de Peer 2009; Huang et al. 2013; Vanneste et al. 2014; Iorizzo et al. 2016; Reyes-Chin-Wo et al. 2017; Harkess et al. 2017; Van de Peer et al. 2017; Akagi et al. 2020). K-Pg, Cretaceous-Paleogene boundary

5.3 New Functions Established via the Lineage-Specific Genome-Wide Duplications

Paleo-genome duplication events, as described above, provide good opportunities to establish new gene functions, because the functional redundancy of duplicates facilitates plasticity, often resulting in subfunctionalization, neofunctionalization, or pseudogenization (Flagel and Wendel 2009; Van de Peer et al. 2017). Neofunctionalization in expression pattern (or cis-evolution) is quite simple to define: it is inferred when either member of a duplicated pair exhibits novel expression behavior. The evolution of cis-regulatory elements is thought to be a more rapid and fundamental response to duplication events than that of trans-acting elements (Lynch and Conery 2000; Roulin et al. 2013). Consistent with this theoretical framework, in the D. lotus genome, the expression patterns of the duplicated pairs derived from the Dd-α were substantially differentiated, even when focusing only on expression in male and female flower buds (Akagi et al. 2020). On the other hand, it is harder to directly define neofunctionalization in trans-acting elements (or protein function) following duplication events. A potential index for calling a novel beneficial function (for survival) triggered by duplication is a transition in the evolutionary rate (dN/dS values). New beneficial amino-acid mutations are often under positive selection (dN/dS ≫ 1.0) and are then subjected to strong purifying selection (dN/dS ≪ 1.0) so as to be genetically fixed. Application of a model that considers this situation, called “episodic positive selection,” to the D. lotus genome identified some candidate genes that acquired a novel function immediately after the Dd-α whole-genome duplication event (Fig. 5.3). It is worth noting that a sex-determining gene, MeGI (see Chap. 8 for the sex-determination system in persimmon), derived from its paralogous gene, named Sister of MeGI (SiMeGI), was included in the list of putatively neofunctionalized genes that underwent episodic positive selection after the Dd-α (Akagi et al. 2020). Importantly, this assumption has been experimentally validated by transformation of Nicotiana tabacum, in which constitutive expression of MeGI resulted in repressed androecium development, or feminization. This feminizing function is not conserved in the paralogous gene, SiMeGI, or in its orthologs in other plant species, such as barley (Komatsuda et al. 2007; Sakuma et al. 2013), Arabidopsis (González-Grandío et al. 2017), maize (Whipple et al. 2011), and tomato (Lin et al. 2008). Hence, detection of the adaptive evolution of a sex-determining gene, MeGI, using evolutionary indices is a good example of exploiting whole-genome information to uncover lineage-specific traits or gene functions. Still, there are likely to be massive untouched gene resources that may have evolved Diospyros-specific new functions via lineage-specific duplication events. Considering the potential of this frontier, rapid gene evaluation systems, such as those using precocious flowering lines as in kiwifruit, which belongs to the same order Ericales (Varkonyi-Gasic et al. 2019), would be the next breakthrough.

Fig. 5.3
Diagram of an evolutionary model showing Paralog-1 and Paralog-2.

Evolutionary model to detect episodic positive selection acting immediately after the Dd-α. We targeted strong positive selection on either of the duplicates derived from the Dd-α (ω1 ≫ 1, on the thick pink branch), which did not occur in the background branches or the counterpart paralog (ω0 < 1). After establishment of an adaptive new function, the evolutionary rate should decrease under purifying selection, with a selective pattern distinct from that of the original orthologs (ω2 < 1, ω2 ≠ ω0, green triangle)
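
The pattern sought by this model can be expressed as a simple filter over per-branch ω (dN/dS) estimates. The following Python sketch assumes that ω0 (background), ω1 (branch immediately after the duplication), and ω2 (later branches) have already been estimated for each duplicate by some codon-substitution model. The class, the function, the gene labels, and all numerical values are hypothetical and serve only to illustrate the thresholds; the sketch is not the statistical test used in the original study.

```python
from dataclasses import dataclass


@dataclass
class BranchOmegas:
    """Hypothetical container for per-branch dN/dS estimates of one duplicate."""
    gene: str
    omega_background: float        # omega0: background branches and counterpart paralog
    omega_post_duplication: float  # omega1: branch immediately after the duplication
    omega_later: float             # omega2: branches after the new function is established


def shows_episodic_positive_selection(est):
    """True if omega1 > 1 while omega0 < 1 and omega2 < 1."""
    return (
        est.omega_post_duplication > 1.0
        and est.omega_background < 1.0
        and est.omega_later < 1.0
    )


# Made-up values for illustration only; they are not taken from any dataset.
estimates = [
    BranchOmegas("pair_A", omega_background=0.20, omega_post_duplication=3.5, omega_later=0.10),
    BranchOmegas("pair_B", omega_background=0.25, omega_post_duplication=0.30, omega_later=0.22),
    BranchOmegas("pair_C", omega_background=0.15, omega_post_duplication=2.1, omega_later=1.40),
]

candidates = [e.gene for e in estimates if shows_episodic_positive_selection(e)]
print(candidates)  # ['pair_A']
```

In practice, the difference between such a three-ratio model and simpler alternatives would need to be supported statistically rather than by fixed thresholds alone.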

5.4 Evolution of the Sex Chromosome

Sex chromosomes in flowering plants were first found independently in white campion (Silene latifolia, and Silene spp.), sorrel (Rumex acetosa), hop (Humulus lupulus), and other species in 1923 (Kihara and Ono 1923; Winge 1923; Blackburn 1923). Since then, sex chromosomes have been studied for about a century, yet their nucleotide sequence contexts and the evolution of the male-specific (or female-specific in ZW systems) regions have been little assessed (Renner and Müller 2021). The genus Diospyros is mostly dioecious, except for potentially polygamous polyploids or minor exceptions, with sex determined by an XY (or heterogametic male) system (see Chaps. 8 and 9). So far, no reports have suggested heteromorphic sex (or XY) chromosomes in Diospyros, and thus the male-specific region of the Y chromosome (MSY) is thought to be very small (Akagi et al. 2014). The draft genome of D. lotus successfully anchored some pseudo-autosomal sex-linked scaffolds to chromosome 15 (Akagi et al. 2020). On the other hand, the MSY, including OGI, has not been anchored using genetic markers, presumably because of the large structural variation between the X and Y chromosomes and the resultant Y-specific hemizygosity surrounding the OGI gene. Instead, with bacterial artificial chromosome (BAC) walking seeded from OGI (Akagi et al. 2014), Y-chromosomal supercontigs covering most of the MSY were physically anchored (Akagi et al. 2020). The regions surrounding OGI are male-specific (or hemizygous) and hyper-repetitive, often including palindrome-like structures. These sequence contexts are consistent with those of sex chromosomes in animal taxa (Bachtrog et al. 2014). Toward the outer regions of the MSY, putative pseudo-autosomal regions (PAR), which include both X- and Y-allelic genes, are partially interspersed within the MSY and become predominant only 200–300 kb away from the single sex determinant, OGI.
Considering the relatively long history of the Y chromosome (or the sex determinant; ca 20–30 million years, estimated from the putative silent divergence between MeGI and OGI) in Diospyros, such a small MSY (up to 600–1000 kb) might be inconsistent with the conventional hypothesis for sex chromosome evolution in plants, in which a distinct MSY can form in only a few million years (Ming et al. 2011). On the other hand, recent characterization of various plant sex chromosomes is uncovering their evolutionary diversity and providing examples of small MSYs, such as in Populus, Vitis, and Actinidia, which are consistent with the sex chromosome evolution in Diospyros (Renner and Müller 2021).

In the MSY of D. lotus, although the boundary with the PAR is quite ambiguous, some genes were predicted in the hemizygous Y-allelic regions, but they are thought to be mostly components of transposable elements, which is consistent with MSY evolution in animal taxa. In the PAR-like sequences, which still include male-specific hemizygous islands, the silent divergence between the putative X and Y alleles of the genes decreases in inverse proportion to the distance from OGI. This pattern is expected, as it reflects historical recombination between the perfectly conserved Y factor, OGI, and the surrounding regions, often forming evolutionary strata (Bergero et al. 2007). The actual dS within 1 Mb of OGI ranges from approx. 0.2 (at OGI) to 0.07, which is comparable to the interspecific dS between D. lotus and distant Diospyros species (such as D. mespiliformis, dS = ca 0.07 against D. lotus), implying that recombination in the regions flanking OGI was suppressed before the divergence of some Diospyros species. Although OGI is thought to have arisen via duplication from the ancestral MeGI (Akagi et al. 2014), long syntenic collinearity between the regions surrounding them is not conserved. This suggests that a local segmental duplication (or simple gene duplication) event, followed by an inversion, triggered the establishment of the dioecious sex-determination system of this genus.
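
The way silent divergence is read as a proxy for age can be sketched with the standard molecular-clock relation T = dS / (2μ), where μ is the synonymous substitution rate per site per year and the factor of 2 accounts for both diverging lineages. The short Python example below is a sketch under stated assumptions: the rate used is an illustrative placeholder, not a value reported in this chapter, and the function name is hypothetical.

```python
def divergence_time_years(ds, rate_per_site_per_year):
    """Molecular-clock estimate T = dS / (2 * mu), assuming a constant rate."""
    if ds < 0 or rate_per_site_per_year <= 0:
        raise ValueError("dS must be non-negative and the rate must be positive")
    return ds / (2.0 * rate_per_site_per_year)


# ASSUMPTION: illustrative substitution rate only; not taken from the chapter.
HYPOTHETICAL_RATE = 4e-9  # synonymous substitutions per site per year

for label, ds in (("dS at OGI", 0.2), ("dS at the lower end of the range", 0.07)):
    age_my = divergence_time_years(ds, HYPOTHETICAL_RATE) / 1e6
    print(f"{label}: dS = {ds}, approx. {age_my:.1f} million years")
```

Because the estimate scales inversely with the assumed rate, any absolute age obtained in this way is only as reliable as the rate calibration for the lineage in question.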

5.5 The D. lotus Genome Sheds Light on the Transition to Dioecy Associated with Duplication Events

In summary, the D. lotus genome information raises the following hypothesis for the transition to dioecy in Diospyros. A lineage-specific WGD event, Dd-α, gave rise to the first pair of twins, proto-MeGI and SiMeGI, and proto-MeGI specifically underwent positive selection to be neofunctionalized as a feminizing factor that dominantly suppresses androecium development. This was followed by a segmental duplication that gave rise to MeGI and a Y-encoded OGI, which dominantly suppresses the expression of MeGI (Akagi et al. 2014, 2020). It is worth noting that this type of evolutionary pattern may be general or common in the establishment of dioecy in other plant species. In the establishment of dioecy in garden asparagus, the Y-encoded sex-determining gene, SOFF, is thought to have originated from an Asparagus-specific segmental or whole-genome duplication, which was followed by the acquisition of its function as a dominant suppressor of feminization (SuF) (Harkess et al. 2017). One of the two Y-encoded sex determinants in kiwifruit (Actinidia spp.), Shy Girl, which is also a dominant suppressor of feminization (Akagi et al. 2018; Varkonyi-Gasic et al. 2021), arose via an Actinidia-specific duplication event (Akagi et al. 2018, 2019), potentially involving one of the two Actinidia-specific WGD events, Ad-α (Huang et al. 2013). Furthermore, in the family Salicaceae, multiple Y-encoded sex determinants, which are mostly small RNAs acting as dominant suppressors of the “master sex regulator” ARR17, were independently derived from lineage-specific duplication events (Müller et al. 2020). These parallel paths toward the independent evolution of sex determinants are probably not coincidental, but consistent with the theoretical framework described above.
In flowering plants, the transition to separate sexes would require the appearance and selection of a gain-of-function event in order to acquire one or more dominant suppressors (Charlesworth and Charlesworth 1978a, 1978b). A dominant suppressor would be hard to generate from a non-redundant (or single) gene because of the importance of its original function, whereas whole-genome duplication events provide good opportunities to relax purifying selective pressure, allowing redundant genes to be neofunctionalized into dominant suppressors.

5.6 Application of the D. lotus Genome Information and Future Prospects

One of the simplest uses of the D. lotus genome is as a reference for mapping transcripts or genomic reads from D. kaki. As in many transcriptomic or whole-genomic assessments of polyploid species, using a closely related diploid as a substitute reference for mapping is an effective way to reduce the issues arising from genomic/genetic complexity. Although the D. lotus genome information was not publicly available until recently, some genome-wide studies have already used it as a reference database for D. kaki. Transcriptomic analyses in D. kaki focusing on unlocked male flower production (Masuda et al. 2020a), fruit shape diversity among cultivars (Maeda et al. 2019), or dwarfism involving gibberellin metabolism (Dong et al. 2021) applied the D. lotus whole-gene set and successfully identified candidate genes or molecular mechanisms. Considering the conservation of genes and their homology between D. lotus and D. kaki, transcriptomic approaches should face no serious issues. As a genomic reference map, Masuda et al. (2020b) mapped captured gDNA Illumina reads (more specifically, ddRAD-Seq data) from a segregating D. kaki population to the D. lotus pseudomolecules and scaffold contigs to identify genetic regions contributing to the bias in the female/male flower ratio. Importantly, the Y chromosome (or the Y-encoded sex determinant, OGI) itself and/or its allelic dosage had a substantial association with the male flower ratio. The D. lotus genome sequences were derived from a male (2A + XY), and the Y-specific regions have been well examined as described above, so this genome is especially suitable for studies involving persimmon sexuality. Epigenetic studies in D. kaki are also feasible with the D. lotus genome as a reference. Fluctuation of genome-wide DNA methylation levels, which is potentially associated with male flower production in a genetically female cultivar, was successfully characterized with the D. lotus genome as the reference (Masuda et al. 2020c). On the other hand, if we target more “specific” sequences in D. kaki, such as short cis-regulatory elements, the D. lotus genome might not be applicable, considering the evolutionary distance between the two species. The genome sequence of D. oleifera, which is slightly more closely related to D. kaki than D. lotus is (Yonemori et al. 2008), would be an alternative option, as described in Chap. 6. Still, we should be careful about the hexaploidy of D. kaki, which exhibits substantial sequence variation among its hexaplex alleles. Also, the evolutionary distance (or sequence homology) between two species can vary greatly among genomic regions, especially in tree crops, where it is hard to define a uniform distance per genome, mainly owing to frequent introgression (Cornille et al. 2012; Numaguchi et al. 2020). Case-by-case selection of a reference genome, depending on the research purpose, is key. With current technologies, constructing a draft genome of nearly perfect quality, such as a chromosome-scale assembly, is becoming much easier and cheaper. We suppose that it would be reasonable not to persist with the limited genome information currently available, but to explore the frontiers of other Diospyros genomes for various objectives. Comparative genomic or population genetic approaches with diverse Diospyros species would shed light on further evolutionary aspects, such as the domestication path of D. kaki.