Introduction

Common buckwheat (Fagopyrum esculentum Moench; 2n = 2x = 16) is an annual outcrossing crop that is considered to be a pseudo-cereal, because the culinary use of its seeds (strictly speaking, achenes) is the same as that of other cereals such as rice (Oryza sativa) and wheat (Triticum aestivum). In 2017, global buckwheat production totaled approximately 3,827,000 tons and the total harvested area was approximately 3,940,000 ha (FAOSTAT; http://www.fao.org/faostat/en/#data/QC/visualize). Buckwheat has a short growing period and is tolerant of cool climates and high elevations. Therefore, buckwheat has been widely cultivated in temperate zones of Eurasia and is used in many traditional foods, such as soba (Japanese noodles), memil guksu (Korean noodles), buckwheat kasha (Russian buckwheat porridge), pizzoccheri (Italian pasta), and galette (French pancake).

Common buckwheat grains contain abundant starch, as well as high-quality protein with a well-balanced amino acid profile (Bonafaccia et al. 2003; Suzuki et al. 2020). Buckwheat is also considered to be a functional food because of its abundance of bioactive compounds that have antioxidative, antihypertensive, and anti-obesity properties (Kreft et al. 2020; Matsui and Walker 2020). These health-promoting properties, along with a lack of gluten, have led to a wider cultivation of buckwheat outside Eurasia, in places such as the USA, Canada, Australia, and New Zealand. However, buckwheat yield is low and variable for a variety of reasons, such as low resistance to excess water, lodging, and pre-harvest sprouting (Morishita et al. 2020). The grains also contain allergenic proteins that cause immunoglobulin E-mediated allergic reactions in some people (Park et al. 2000; Satoh et al. 2020). Therefore, buckwheat is cultivated as a minor crop around the world.

Because buckwheat has many beneficial qualities, breeders have been interested in developing new varieties with improved yield and fewer undesirable traits (Ohsawa 2020). However, buckwheat’s outcrossing habit, enforced by a heteromorphic self-incompatibility (SI) system, has made it difficult to establish fixed varieties and carry out efficient genetic analyses. Despite these challenging conditions, buckwheat breeding and genetic studies have been performed for decades.

Based on our studies on the heteromorphic SI system of buckwheat, we have developed self-compatible (SC) common buckwheat lines. SC lines have many beneficial characteristics for buckwheat breeding and cultivation: for example, (1) segregating lines can be easily produced by self-pollination for gene analysis; (2) the morphology and genotypes can be easily fixed; and (3) seeds can be produced without insects, resulting in stable yields that are less affected by the environment (Matsui et al. 2007; Matsui and Yasui 2020). We have also developed a set of molecular markers and a genome database to facilitate the genetics and breeding of buckwheat and have identified some agronomically important genes using molecular technology, including next-generation sequencing (NGS). The development of plant materials, especially SC lines, and technologies, such as sequencing and genomic analysis, has greatly accelerated buckwheat breeding. In this review, we discuss these and other recent advances in buckwheat improvement and consider the future of buckwheat breeding given the recently available technologies and information.

Development of self-compatible common buckwheat lines

Heterostylous self-incompatibility and the development of self-compatible lines

Buckwheat heterostyly has been reviewed elsewhere (Ueno et al. 2016; Matsui and Yasui 2020); here, we focus on the use of self-incompatibility and self-compatibility for buckwheat breeding. Buckwheat is strictly outcrossing due to heterostylous SI and has two flower types, pin and thrum (Fig. 1a, b). Pin flowers have a long style and short anthers, whereas thrum flowers have a short style and long anthers (Darwin 1877). The flower morphology and self-incompatibility are governed by a single, complex S locus, called the S supergene. The S locus of buckwheat has two alleles, S and s; thrum plants are Ss and pin plants are ss (reviewed in Matsui and Yasui 2020). As buckwheat depends on pollinators such as bees and flies for crossing, yields are strongly influenced by environmental factors that affect insect activity.

Fig. 1 Buckwheat flowers. a Pin; b thrum; c long homostyle; d short homostyle

Self-pollinating buckwheat lines that can set seed using their own pistils and stamens have been developed from variants isolated from natural populations or by interspecific crosses between F. esculentum and F. homotropicum (Fig. 1c, d) (Marshall 1969; Campbell 1995; Aii et al. 1998; Woo et al. 1999; Matsui et al. 2003b; Wang et al. 2005b). Marshall (1970) named and registered one self-pollinating line as “Pennline 10.” The self-pollination of “Pennline 10” is governed not by the S locus but by at least three other genetic loci (Marshall 1969; Matsui et al. 2004b). Matsui et al. (2004b) suggested that seed production via a plant’s own pistils and stamens in this line was caused by a weak SI reaction. The genes at these loci have not yet been identified, and isolating them would be an important step toward increasing the percentage of self-pollinated seeds and enabling heterosis breeding (Matsui et al. 2007; Mukasa et al. 2010).

The self-compatible (SC) wild species Fagopyrum homotropicum was first identified in 1992 (Ohnishi 1998), and its discovery suggested the possibility of introducing SC genes into common buckwheat, F. esculentum. An SC line developed from an interspecific cross between F. esculentum and F. homotropicum was first reported by Campbell et al. (1995). Subsequently, a series of interspecific crosses performed by Woo et al. (1999) revealed that the SC property is conferred by the Sh allele at the S locus of F. homotropicum, and that Sh is dominant over the s allele but recessive to the S allele. Matsui et al. (2003b) pollinated an SC line with pollen from pin or thrum plants and noted that thrum pollen tubes grew through the style and reached the ovules of the SC line, whereas pin pollen tubes stopped in the middle of the style without reaching an ovule.
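The dominance relationships reported by Woo et al. (1999) can be sketched in a few lines of code. The following is a minimal illustration, assuming that the phenotype is set entirely by the most dominant allele present (S over Sh over s); the function names and phenotype labels are hypothetical and chosen only for this example.

```python
# Dominance order at the S locus as described in the text: S > Sh > s.
DOMINANCE = ["S", "Sh", "s"]  # most dominant first

# Illustrative phenotype labels (hypothetical names for this sketch).
PHENOTYPE = {"S": "thrum", "Sh": "SC", "s": "pin"}


def flower_type(allele1, allele2):
    """Return the phenotype set by the more dominant of two S-locus alleles."""
    top = min(allele1, allele2, key=DOMINANCE.index)
    return PHENOTYPE[top]


def cross(parent1, parent2):
    """Tally offspring phenotypes over the four equally likely gamete pairs."""
    counts = {}
    for a in parent1:
        for b in parent2:
            ph = flower_type(a, b)
            counts[ph] = counts.get(ph, 0) + 1
    return counts


# Selfing a heterozygous SC plant (Sh s) gives SC and pin progeny.
print(cross(("Sh", "s"), ("Sh", "s")))
# An ordinary thrum x pin cross (Ss x ss) gives thrum and pin progeny.
print(cross(("S", "s"), ("s", "s")))
```

Under these assumptions, selfing an Sh s plant is expected to give SC and pin progeny in a 3:1 ratio, which is why codominant markers are helpful for picking out Sh Sh individuals.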

The results of additional genetic analysis using the SC line supported the hypothesis that the S locus is a supergene complex that includes five genes (Dowrick 1956; Sharma and Boyes 1961): style length (G for short style, g for long style), style incompatibility (IS for style incompatibility of short style, is for style incompatibility of long style), pollen incompatibility (IP for pollen incompatibility of long anther, ip for pollen incompatibility of short anther), pollen size (P for large pollen grain, p for small pollen grain), and anther height (A for long anther, a for short anther). In this scheme, the dominant alleles together specify the thrum morph (short style, long anthers). Wang et al. (2005a) reported another locus controlling self-compatibility that was derived from the interspecific cross between F. esculentum and F. homotropicum. Further research is needed to determine whether the locus identified by Wang et al. (2005a) is related to the loci derived from “Pennline 10” (Marshall 1969; Matsui et al. 2004b).

SC lines developed by interspecific crosses between F. esculentum and F. homotropicum often display a shattering habit that is derived from F. homotropicum. Shattering is conditioned by at least two different seed pedicel traits: (1) weakness and (2) brittleness caused by the formation of an abscission zone and dissociation at the zone (Matsui et al. 2003a). A genetic analysis clarified that brittle pedicel in buckwheat is controlled by at least two loci, Sht1 and Sht2, and that Sht1 is linked to the S locus (Matsui et al. 2003a; Wang et al. 2005b). A non-brittle pedicel phenotype results when plants are homozygous recessive at both loci or at just one of them; that is, brittleness requires a dominant allele at both loci. Because all cultivated buckwheat lines are homozygous for the recessive sht1 allele (Matsui et al. 2003a), an SC line that was also homozygous for sht1 was selected and registered as “Norin-PL1” (formerly known as “Kyushu PL4” in Japan; Matsui et al. 2008b). Although this line can be used to introduce SC into cultivated elite varieties, the descendant lines show lower yields than the parental elite variety (Matsui et al. 2008b).
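The two-locus control of brittle pedicel described above can be illustrated with a short enumeration. This is only a sketch: it assumes that Sht1 and Sht2 assort independently and that a single dominant allele at each locus is sufficient for brittleness. Neither assumption is tested here, and the function names are hypothetical.

```python
from itertools import product


def is_brittle(genotype_sht1, genotype_sht2):
    """Brittle pedicel needs at least one dominant allele at BOTH loci."""
    return "Sht1" in genotype_sht1 and "Sht2" in genotype_sht2


def f2_counts():
    """Count brittle and non-brittle classes among the 16 equally likely
    gamete combinations from selfing a double heterozygote (assumes the
    two loci assort independently)."""
    gametes = list(product(["Sht1", "sht1"], ["Sht2", "sht2"]))
    brittle = non_brittle = 0
    for (a1, a2), (b1, b2) in product(gametes, repeat=2):
        if is_brittle((a1, b1), (a2, b2)):
            brittle += 1
        else:
            non_brittle += 1
    return brittle, non_brittle


print(f2_counts())  # (9, 7)
```

Under these assumptions the F2 would segregate 9 brittle to 7 non-brittle, and any plant that is homozygous for sht1, like the cultivated lines, is non-brittle regardless of its Sht2 genotype.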

Breeding by self-pollination often causes a reduction in yield and plant growth, a phenomenon known as inbreeding depression. Although inbreeding depression is thought to be caused by homozygosity at specific loci, the loci and underlying mechanisms have not been clarified in common buckwheat. To overcome this inbreeding depression, “Norin-PL1” and elite varieties must continue to be crossed. To increase the efficiency of these crossing experiments, codominant markers distinguishing Sh and s alleles were previously developed for fixing Sh alleles in the F2 generation (see the later “Molecular markers” section) (Matsui et al. 2020). Though “Norin-PL1” is not cultivated by farmers, this line has been useful in a variety of genetic and breeding studies (e.g., Hara et al. 2011, 2020a for identifying genes controlling photoperiod sensitivity; Yasui et al. 2012 for identifying heteromorphic SI-related genes; Matsui et al. 2013 for genetic analysis of early leaf senescence; Katsu et al. 2017 for flavonoid biosynthesis; Hara et al. 2020b for pre-harvest sprouting). Furthermore, several SC lines other than “Norin-PL1” have been developed and used for studies of buckwheat cultivation and breeding (e.g., Mukasa et al. 2006; Kasajima et al. 2017), indicating the importance of establishing SC lines of common buckwheat.

Molecular analysis of genes at the S locus

Although we have developed SC lines, it is important to decipher the genomic structure of the S locus and to clarify the function of each of its genes in order to develop molecular markers and to control SC/SI of buckwheat plants for efficient breeding and production. The S locus of buckwheat is a complex that includes at least five genes, none of which have been isolated. Yasui et al. (2012) identified a candidate gene, S LOCUS EARLY FLOWERING3 (S-ELF3), for the style length gene (G) and/or style incompatibility gene (IS) of the S locus. Interestingly, the S-ELF3 gene exists only in the S allele of the S locus (Yasui et al. 2012, 2016). This may be why the S locus is diallelic in buckwheat, unlike in other SI plant taxa, such as the Brassicaceae and Rosaceae, which have multiple alleles at the S locus (Takayama and Isogai 2005). However, the role of S-ELF3 is unclear; it is not known whether the gene controls the length or the SI phenotype of the style. Methods for mutagenesis and stable transformation of buckwheat will be needed to answer this question (see also the later “Demonstration of gene function” section).

Takeshima et al. (2019) identified a polygalacturonase, designated FePG1, that is related to style length in buckwheat flowers. Accumulation of this enzyme and expression of its gene were observed only in the styles of thrum plants. Intriguingly, the gene is expressed in the short style of “Pennline 10,” even though this line is homozygous for s at the S locus (Matsui et al. 2004b). The FePG1 gene is not part of the S locus and is thought to function downstream of it as part of the SI reaction. The researchers also hypothesized that FePG1 accumulation is specifically regulated by S-ELF3.

Useful and reliable methods to obtain F1 seeds

SC lines facilitate genetic analyses because they allow the production of segregating lines such as F2 populations and recombinant inbred lines (RILs). Crossing is the first step in the production of segregating lines. Crosses between SI and SC plants are relatively straightforward, because the SI plants cannot produce self-pollinated seeds. By contrast, crosses among SC plants are challenging, because SC plants produce self-pollinated seeds that cannot be distinguished from crossed seeds, and physical emasculation with tweezers is extremely challenging, due to the small size of buckwheat flowers and the difficulty in determining the proper timing for the procedure within a closed bud.

Mukasa et al. (2006) developed a method to emasculate buckwheat flowers with hot water: immersion in 42 °C water for 5 min. No self-pollinated seeds were produced after this treatment in their study. Although it is labor-intensive to prepare many flowers in this way, the method is a reliable way to obtain F1 seeds. Nevertheless, the possibility that some of the seeds obtained are self-pollinated cannot be excluded entirely, so the hybrid status of the progeny should be verified.

Morphological markers are useful tools to test if F1 plants are the result of the intended cross. For example, when a plant homozygous for a recessive morphological trait is used for the maternal parent and a plant homozygous for a dominant allele for the same trait is used for the pollen parent, F1 plants should show the morphological trait conferred by the dominant allele inherited from the pollen parent. Plants produced by the unintended cross (i.e., self-pollination) should show the morphology conferred by the recessive alleles inherited from the maternal parent.
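The logic of this check can be written down as a simple classification. The sketch below assumes a maternal parent homozygous for a recessive trait and a pollen parent homozygous for the dominant allele; the stem colors used as labels and the function name are illustrative only.

```python
def classify_seedlings(phenotypes, dominant="red", recessive="green"):
    """Classify seedlings from a (recessive mother x dominant father) cross.

    Seedlings showing the dominant phenotype must carry the paternal
    allele and are counted as hybrids; seedlings showing the recessive
    phenotype are counted as products of self-pollination.
    """
    if not phenotypes:
        raise ValueError("no seedlings scored")
    hybrids = sum(1 for p in phenotypes if p == dominant)
    selfed = sum(1 for p in phenotypes if p == recessive)
    return {
        "hybrid": hybrids,
        "selfed": selfed,
        "selfing_rate": selfed / len(phenotypes),
    }


# Hypothetical scores for four seedlings.
print(classify_seedlings(["red", "red", "green", "red"]))
```

The check works in only one direction: if the maternal parent carried the dominant allele, hybrid and selfed seedlings would look alike, which is why the recessive line must be used as the seed parent.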

Matsui et al. (2008a) reported a spontaneous mutant with green stems, a phenotype that can be observed in seedlings and which is controlled by a single recessive gene locus. As most buckwheat cultivars and self-compatible lines we have developed have red stems, a self-compatible line with green stems would be very useful for verifying the success of crosses (Matsui et al. 2018b). Dwarfing is another morphological marker that could be used to test crosses. Ohnishi and Nagakubo (1982) reported six single recessive loci that control dwarfing in buckwheat, and Ohnishi and Ohta (1987) constructed a linkage map of loci related to dwarfing, including these six loci. However, because dwarfing has pleiotropic effects on many agricultural traits, it is necessary to consider whether segregating lines having a dwarf trait would still be useful for genetic analysis.

Molecular marker analysis is the most reliable method to determine if the intended crosses have been performed. However, this type of analysis depends on the availability of markers that can detect polymorphisms between lines. We describe the development of such molecular markers in the next sections.

Development of molecular markers and construction of genetic and physical maps

Molecular markers

Molecular markers linked to target traits, such as crop quality and disease resistance, allow desirable progeny to be quickly and accurately identified at the seedling stage without having to phenotype each plant. Several types of molecular markers have been developed and used in buckwheat (Yasui 2020).

In the late 1990s and early 2000s, polymerase chain reaction (PCR) was commonly used for detecting DNA polymorphisms in many crop species. Random amplification of polymorphic DNA (RAPD) (Williams et al. 1990) and amplified fragment length polymorphism (AFLP) (Vos et al. 1995) have been widely used as molecular markers in common buckwheat (Aii et al. 1998; Konishi et al. 2005; Kump and Javornik 1996; Matsui et al. 2004a; Murai and Ohnishi 1996; Sharma and Jana 2002; Suvorova et al. 1999; Yasui et al. 2004). In almost all cases, these two technologies generate dominant markers. Because dominant markers cannot distinguish between the homozygous and heterozygous states, it takes at least two generations, F2 and F3, to fix and confirm the genotype. For this reason, simple sequence repeats (SSRs) (Litt and Luty 1989) and expressed sequence tags (ESTs) were subsequently developed as codominant markers in common buckwheat, as well as in other crops (Konishi et al. 2006). SSRs have been used for buckwheat population genetics and assessment of genetic diversity (Iwata et al. 2005; Konishi and Ohnishi 2007; Ma et al. 2009), and EST markers were successfully used to identify candidate photoperiod sensitivity genes in this crop (Hara et al. 2011).
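The difference between dominant and codominant markers can be made concrete with a toy F2 population. The following sketch assumes a single locus with alleles A and a segregating 1:2:1; the scoring labels and function names are hypothetical.

```python
from collections import defaultdict

# Expected 1:2:1 genotype classes in an F2 population.
F2_GENOTYPES = ["AA", "Aa", "Aa", "aa"]


def score_dominant(genotype):
    """A dominant marker only shows whether allele A is present."""
    return "band present" if "A" in genotype else "band absent"


def score_codominant(genotype):
    """A codominant marker shows which alleles are present."""
    return {"AA": "A band only", "Aa": "A and a bands", "aa": "a band only"}[genotype]


def classes(score):
    """Group the F2 genotypes by the marker pattern they produce."""
    groups = defaultdict(set)
    for genotype in F2_GENOTYPES:
        groups[score(genotype)].add(genotype)
    return dict(groups)


print(classes(score_dominant))    # AA and Aa fall into the same class
print(classes(score_codominant))  # three distinguishable classes
```

With the dominant marker, AA and Aa plants share one pattern, so a further generation is needed to tell which band-positive F2 plants breed true; the codominant marker identifies the homozygotes directly in the F2.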

From 2010 onwards, next-generation sequencing (NGS) has been used in breeding and genetic studies of common buckwheat, and genome-wide genotyping by sequencing (GBS, Elshire et al. 2011) analysis was carried out to estimate the genetic diversity within this crop (Mizuno and Yasui 2019). This work revealed high genetic diversity (π = 0.0065) in cultivated common buckwheat and also indicated gene flow from wild to cultivated buckwheat. Bulked segregant analysis (BSA) using NGS technology has recently been used to develop codominant markers that can distinguish the Sh and s alleles of the S locus (Matsui et al. 2020). If a target trait is controlled by a small number of genes, as is the case for SI, BSA is a powerful way to develop molecular markers. However, if target traits are controlled by several genes, each with a small effect, quantitative trait locus (QTL) analysis is necessary. For QTL analysis, high-density genetic maps must be constructed, and NGS technology is now being used for this purpose, as discussed below.
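For readers unfamiliar with the diversity statistic quoted above, nucleotide diversity (π) is the average number of nucleotide differences per site between two sequences drawn from the sample. The sketch below shows the calculation for a toy alignment; the sequences are invented for illustration and the function is not the pipeline used in the cited study.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site over all sequence pairs."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Hypothetical 10-bp alignment of three haplotypes.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
print(nucleotide_diversity(toy))
```

A value of π = 0.0065 therefore corresponds to roughly 6.5 differences per 1000 sites between two randomly chosen sequences.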

Genetic maps

Many important agronomic traits, such as yield and quality, are controlled by QTLs, and a genetic map is now available for QTL analysis in buckwheat. Construction of a genetic map typically requires a segregating population. Because buckwheat is self-incompatible, an F2 segregating population cannot be produced using normal buckwheat. Pseudo-testcrosses (Grattapaglia and Sederoff 1994), which use F1 plants, are an alternative way to construct genetic maps for outcrossing species. Recently, this strategy was used to map genome-wide linkage array-based markers (8884 markers spanning 756 loci) onto eight linkage groups (the number of chromosomes in common buckwheat), and a QTL controlling main stem length was successfully identified (Yabe et al. 2014). A genetic map produced with pseudo-testcrosses was also successfully used for genomic selection (GS) to increase buckwheat yield (Yabe et al. 2018). Although the pseudo-testcross strategy is useful for linkage analyses, many F1 seeds must be obtained from a cross between two individuals. Furthermore, this strategy is of limited use for QTL analysis, because each F1 plant is genetically unique and the evaluation of morphological traits therefore rests on data from a single plant per genotype.
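The pseudo-testcross idea can be sketched as a marker classification step. In this illustration, a marker is informative for one parent's map when that parent is heterozygous and the other is homozygous, so that the marker segregates 1:1 in the F1. The genotype encoding and function names are assumptions made for this example.

```python
def marker_class(parent1, parent2):
    """Classify a marker by the parental genotypes (tuples of two alleles)."""
    het1 = parent1[0] != parent1[1]
    het2 = parent2[0] != parent2[1]
    if het1 and not het2:
        return "testcross, parent 1 map"
    if het2 and not het1:
        return "testcross, parent 2 map"
    if het1 and het2:
        return "intercross"
    return "uninformative"


def f1_ratio(parent1, parent2):
    """Tally F1 genotypes over the four equally likely gamete pairs."""
    counts = {}
    for a in parent1:
        for b in parent2:
            genotype = "".join(sorted((a, b)))
            counts[genotype] = counts.get(genotype, 0) + 1
    return counts


print(marker_class(("A", "a"), ("a", "a")))  # testcross, parent 1 map
print(f1_ratio(("A", "a"), ("a", "a")))      # {'Aa': 2, 'aa': 2}
```

Markers of the two testcross classes are typically used to build one linkage map per parent, which is why the approach works even though no F2 generation is available.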

SC lines enable the production of F2 lines and RILs. RILs are appropriate for QTL analysis, because each line is genetically fixed and trait data can therefore be obtained from several replications. Using an expressed sequence tag (EST) marker linkage map combined with several molecular markers, Hara et al. (2011) identified QTLs controlling photoperiod sensitivity in a segregating population derived from a cross between SC lines. In the near future, it will be possible to identify agriculturally important QTLs in buckwheat using RILs and high-density NGS-based maps.

Physical maps

Buckwheat has 2n = 2x = 16 chromosomes (Morris 1951), all of similar size (ca. 2–5 μm; Zu et al. 1984; Sinkovič and Bohanec 1988), making it difficult to distinguish them by size alone. Several cytological methods, including banding and fluorescent in situ hybridization (FISH), have been developed to distinguish individual chromosomes in many plant species. In buckwheat, Kikuchi et al. (2008) performed in situ hybridization with 5S and 45S ribosomal RNA genes (rDNAs) and demonstrated that F. esculentum contains three 5S and three 45S rDNA loci. These loci allow us to identify five out of the eight pairs of chromosomes by their rDNA pattern (Fig. 2). Li et al. (2013) detected nuclear genes of chloroplast origin by fluorescent chromosome in situ PCR and were able to distinguish many chromosomes not only in common buckwheat, but also in other Fagopyrum species, such as F. tataricum and F. cymosum. Sheng et al. (2013) used heterochromatin banding analysis and rDNA mapping to identify the individual chromosomes in all the trisomic lines that were developed by Chen et al. (2007). Integration of the physical and genetic maps for buckwheat would accelerate the breeding process by facilitating the use of cytological techniques, such as chromosome engineering with a microscope and optical tweezers (e.g., Gao et al. 2019).

Fig. 2 FISH images of the mitotic chromosomes of three buckwheat species, F. esculentum (a), F. tataricum (b), and F. cymosum (c) probed with 5S and 45S rDNA (Kikuchi et al. 2008). Green signal: 5S rDNA. Red signal: 45S rDNA. Bars (the small white lines between chromosome pairs in the images) = 10 μm

Construction of a buckwheat genome database (BGDB)

Genomic and cDNA libraries have been constructed for many plant species, including buckwheat, and have been used to isolate genes encoding target traits (e.g., Brkljacic et al. 1999; Fujino et al. 2001; Samardzic et al. 2004; Yasui et al. 2008). However, it takes a long time to construct these libraries and to screen them with specific probes. A genome database makes it easier to isolate a target gene and identify the sequence of a gene’s promoter region. Genome databases have been developed and widely used for the model plant Arabidopsis thaliana and for major crops, such as rice (Oryza sativa), maize (Zea mays), and barley (Hordeum vulgare). Recently, our research group has developed a buckwheat genome database (BGDB; http://buckwheat.kazusa.or.jp/) (Yasui et al. 2016).

As discussed above, buckwheat is an SI species and is highly heterozygous, which means that any given individual will have two different alleles (and thus two different genomic sequences) at many of its loci. To increase the proportion of homozygous regions in the genome, we used sib-crossing to develop a line of plants, BC1F6, from which we selected a thrum individual for constructing the BGDB. We chose a thrum plant because we were interested in analyzing the genome region containing the S locus and only thrum plants carry the S allele. After assembling Illumina short reads with SOAPdenovo2 (Li et al. 2010a), we obtained 387,594 scaffolds as the draft genome sequence (FES_r1.0). The total length of FES_r1.0 was 1,177,687,305 bp, which is approximately 88% of the genome size estimated from cytometry analyses (1.34 Gb) (Nagano et al. 2000). Gene prediction and annotation analyses revealed 35,816 functional coding sequences (CDSs), excluding transposable elements. All assembled genome sequences and predicted genes can be downloaded from the BGDB, which also supports local BLAST searches and keyword searches of gene names and annotations for identifying target genes.
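The assembly figures quoted above can be put in perspective with some simple arithmetic. The short calculation below uses only the numbers given in the text; the variable names are illustrative.

```python
# Figures for the FES_r1.0 draft assembly as quoted in the text.
total_bp = 1_177_687_305    # total assembled length (bp)
n_scaffolds = 387_594       # number of scaffolds
estimated_bp = 1.34e9       # genome size estimated by cytometry (bp)

mean_scaffold_bp = total_bp / n_scaffolds
fraction_assembled = total_bp / estimated_bp

print(f"mean scaffold length: {mean_scaffold_bp:.0f} bp")
print(f"assembled fraction of estimated genome: {fraction_assembled:.1%}")
```

The mean scaffold length of about 3 kb shows how fragmented the draft is, which is relevant to the later discussion of why long-range linkage information is still lacking.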

Identification of genes controlling agronomically important traits

Identification of a gene using DNA sequence homology

Many major plant genes are conserved beyond the species level because similar traits are usually regulated by similar genetic systems, such as the genes encoding the flavonoid biosynthetic pathway. Because the BGDB had not been developed until recently, gene-specific primers and degenerate primers based on amino acid homology have been used to isolate genes from buckwheat (e.g., Yasui and Ohnishi 1996, 1998a, b; Fang et al. 2014, 2015; Li et al. 2010b, 2017; Matsui et al. 2018a; Tomiyoshi et al. 2012).

Degenerate primers are usually used to amplify genes from cDNA libraries because genomic DNA includes many non-expressed pseudo-genes, and it is important to identify only genes that are expressed in the organs of interest. However, it is possible to use genomic DNA as the template DNA. For example, Katsu et al. (2017) identified two dihydroflavonol 4-reductase (DFR) genes, one of which is strongly expressed in roots, using degenerate PCR primers and genomic DNA and cDNA as the template.

Local BLAST searches in the BGDB are a convenient way to search for genes by homology. For example, we have used the BGDB to identify some agronomically useful genes, such as those related to flavonoid biosynthesis (Matsui and Walker 2020) and those encoding 2S albumin-type allergens and granule-bound starch synthases (Yasui et al. 2016). Although the draft genome sequence (FES_r1.0) (Yasui et al. 2016) is fragmented into a large number of scaffolds, many genes that control agronomically important traits are easily identified using the gene annotations available in the BGDB.

Identification of a gene by finding specific protein

Proteins, including enzymes, are produced by translation of mRNA. Thus, identifying a specific protein related to a target trait can be the first step toward identifying the gene controlling the trait. To identify a gene related to self-incompatibility in buckwheat, we examined the styles of each flower type, pin and thrum, using 2D-PAGE analysis (Takeshima et al. 2019). We identified a short style-specific protein and the gene encoding it (see “Molecular analysis of genes at the S locus” section). Linkage analysis showed that the gene was not linked to the S locus, but we deduced that it would function downstream of the S locus in the self-incompatibility reaction.

Identifying a protein often reveals genes functioning downstream of the associated traits, as in the case of the short style-specific protein. However, it sometimes represents a major advance that clarifies a mechanism. For instance, Nagatomo et al. (2014) identified an enzyme associated with flavone biosynthesis in buckwheat and clarified that the flavone glycosides are C-glycosylated before the skeleton of apigenin or luteolin flavones is completed.

Identification of a genomic region and expression of a gene related to target traits

Many important agronomic traits such as yield, excess water resistance, lodging resistance, pre-harvest sprouting, and photoperiod sensitivity are controlled by QTLs. However, little is known about the genes associated with these traits (Morishita et al. 2020), because of the difficulties of genetically analyzing these traits (see “Genetic maps” section). The development of genetic markers and linkage maps has accelerated genetic studies of these agronomic traits. As discussed in “Genetic maps” section, Hara et al. (2011) identified QTLs associated with photoperiod sensitivity by QTL analysis. Recently, they identified genes involved in the photoperiod pathway in the BGDB and mapped several of these genes to QTLs associated with photoperiod sensitivity, suggesting the possibility of using selection markers (Hara et al. 2020a).

With the development of reference genome data and sequencing technology, it has become possible to compare the genomes of different plant types. GBS is one approach for identifying sequence differences such as single-nucleotide polymorphisms (SNPs): genomic DNA is cleaved with restriction enzymes, the fragments are amplified by PCR, and the products are sequenced on a next-generation sequencer. Yasui et al. (2016) detected candidate genes on the S locus by comparing the genomes of pin and thrum plants using this technique. Genome-wide transcriptome analysis such as RNA-Seq has also become possible thanks to advances in sequencing technologies and is a powerful tool for identifying candidate genes. Yasui et al. (2012) identified a candidate gene, S-ELF3, on the S locus using transcriptome analysis. Furthermore, genes associated with aluminum tolerance have been identified by transcriptome analysis (e.g., Yokosho et al. 2014; Lei et al. 2017a, b; Che et al. 2018).

Genomics-based methods such as genome-wide association studies (GWAS) would be effective for identifying genomic regions that are related to target traits. However, there is no report of GWAS for buckwheat, largely due to the lack of long-range linkage information such as pseudo-molecules of chromosomes. Establishing pseudo-molecules will enable the identification of agronomically useful genes in buckwheat. An SC line harboring alleles in the homozygous state at each locus would help confirm the phenotypes of the target alleles deduced by GWAS.

Demonstration of gene function

With the development of a genome database, it has become relatively easy to isolate a candidate gene that is expected to control a trait of interest. However, isolating a candidate does not establish that the gene indeed controls the trait. Therefore, it is necessary to demonstrate its function using several methods.

In many cases, expression analysis of genes cannot demonstrate the role of a gene itself. However, expression data provide indirect evidence as to the function of the gene. Although expression analysis is conducted by RNA blot or quantitative-PCR (q-PCR), q-PCR is the major method used, due to its simplicity and convenience. Using this technique, the functions of many genes have been predicted.

Generating transgenic plants that overexpress the target gene is a powerful approach for demonstrating the function of the gene. Because buckwheat is a self-incompatible plant, T2 seeds cannot be obtained by self-pollination after transformation. Furthermore, the reproductive ability of each plant would differ based on genotypic differences. For these reasons, genes identified in buckwheat are often transgenically expressed in different plant species, such as Arabidopsis and tobacco (Nicotiana tabacum) (Che et al. 2018; Matsui et al. 2018a). Although buckwheat cannot set self-pollinated seeds, there are some reports of successful transformation in buckwheat. Miljuš-Djukić et al. (1992) used Agrobacterium tumefaciens to produce kanamycin-resistant buckwheat plants that carry the neomycin phosphotransferase II gene. Kim et al. (2010) reported the production of kanamycin-resistant hairy roots using A. rhizogenes. Furthermore, in planta methods of transformation based on needle infiltration of A. tumefaciens into the apical meristems of seedlings (Kojima et al. 2000a, b) and on vacuum infiltration (Bratic et al. 2007) were reported. Chen et al. (2008) transgenically expressed a vacuolar Na+/H+ antiporter gene from Arabidopsis in buckwheat and noted that the plants were able to withstand high salinity conditions.

A stable transformation method to generate transgenic plants is a powerful approach for assessing gene function. In the case of buckwheat, the efficient production of transgenic buckwheat remains difficult, although transgenic plants have indeed been successfully generated, as mentioned above. Furthermore, the transformation process and maintenance of the established transgenic plants are time-consuming, and this approach therefore has limited use in large-scale studies.

A transient gene expression assay is an alternative, convenient method to investigate the function of genes in plant cells. For example, Lei et al. (2017a) clarified the subcellular localization of two ABC transporters, FeALS1 and FeALS2, by transferring GFP fusions of these genes driven by the 35S-CaMV promoter into buckwheat protoplasts.

Genes encoding transcription factors are often key regulators of important agronomic traits. Therefore, it is important to clarify the function of transcription factors. In many cases, a transcription factor regulates the transcription of several genes associated with a particular trait, such as flavonol biosynthesis. Reporter assays are an effective tool for clarifying whether a transcription factor regulates the transcription of a particular target gene. To perform a reporter assay, two constructs must be prepared, one harboring the gene encoding the transcription factor (i.e., the effector) and the other a promoter region of the target gene fused to a reporter gene such as luciferase (i.e., the reporter). Instead of observing individual cells using a microscope, the activity of the reporter gene is measured across the many cells into which the constructs have been introduced.
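How such measurements are usually turned into a result can be sketched as follows. This is a generic illustration, not the protocol of any study cited here: it assumes that each sample also carries an internal control reporter used for normalization, and that activation is expressed relative to samples receiving an empty effector construct. All readings and names are hypothetical.

```python
def relative_activity(reporter_signal, control_signal):
    """Normalize the reporter signal to an internal control (an assumption
    of this sketch; the normalization scheme is not specified in the text)."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return reporter_signal / control_signal


def fold_activation(effector_samples, empty_samples):
    """Mean normalized activity with the effector divided by that without it."""
    def mean_activity(samples):
        values = [relative_activity(r, c) for r, c in samples]
        return sum(values) / len(values)

    return mean_activity(effector_samples) / mean_activity(empty_samples)


# Hypothetical (reporter, control) luminescence readings.
with_tf = [(900.0, 100.0), (1100.0, 100.0)]
without_tf = [(200.0, 100.0), (200.0, 100.0)]
print(fold_activation(with_tf, without_tf))  # 5.0
```

A fold activation clearly above 1 would be read as evidence that the transcription factor activates the promoter, and a value below 1 as evidence of repression, subject to replication and appropriate statistics.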

Matsui et al. (2018a) investigated whether the transcription factor FeMYBF1 regulates the transcription of genes encoding flavonoid biosynthetic enzymes in a transient assay using Arabidopsis leaves and protoplasts. Sakamoto et al. (2020) developed a transient assay involving buckwheat protoplasts. By applying this technology, the regulation of relevant genes can be examined in buckwheat cells themselves.

Future prospects

Efficient use of genetic resources and artificial mutation with SC lines

Buckwheat is grown throughout the world and can withstand various environmental conditions, particularly a wide range of temperatures and photoperiods. Buckwheat breeding has been carried out in many countries, including Japan (Ohsawa 2020), but current varieties still have undesirable traits. The variety grown in each area is often referred to as a landrace or local variety. Landraces of common buckwheat possess high genetic diversity (Mizuno and Yasui 2019; Ohnishi 1993). Thus, numerous important recessive traits in common buckwheat are expected to be masked by dominant traits in the heterozygous state. Crossing SC lines with a landrace yields SC descendants, in which hidden useful traits may emerge. In addition, SC lines harboring mutations induced by mutagens such as EMS and gamma rays are a suitable genetic tool for forward genetics. Thus, SC lines are very effective for producing new qualitative traits. Targeting Induced Local Lesions in Genomes (TILLING) is a powerful method for identifying mutations in a gene of interest. Wheat lacking amylose in the endosperm, termed waxy wheat, has been produced using this method (Slade et al. 2005). This method should facilitate the identification of valuable alleles, such as those conferring non-allergenic grains or glutinous starch, from mutation-induced populations and landraces in buckwheat.
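
The way selfing exposes a masked recessive trait can be illustrated with a simple Mendelian calculation. The sketch below is not taken from the studies cited here. It assumes a single locus, a heterozygous (Aa) starting plant, strict self-pollination, and no selection or differences in viability. The function name is hypothetical.

```python
from fractions import Fraction


def genotype_frequencies_after_selfing(generations):
    """Expected (AA, Aa, aa) frequencies after selfing an Aa plant.

    Heterozygosity halves with every generation of selfing, and the lost
    heterozygotes are split equally between the two homozygous classes.
    """
    if generations < 0:
        raise ValueError("generations must be non-negative")
    heterozygous = Fraction(1, 2) ** generations
    homozygous = (1 - heterozygous) / 2
    return homozygous, heterozygous, homozygous


for n in range(1, 5):
    aa = genotype_frequencies_after_selfing(n)[2]
    print(f"after {n} generation(s) of selfing: expected aa frequency = {aa}")
```

Under these assumptions, a quarter of the first selfed generation is expected to be homozygous for the recessive allele and to show the hidden trait, and this proportion approaches one half with further selfing.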

Quick production of a new variety fitting each area harboring valuable traits

Although SC buckwheat has many benefits, as mentioned above, continuous self-pollination would induce inbreeding depression (Marshall 1979). Furthermore, buckwheat is strongly influenced by environmental factors such as photoperiod and temperature. Thus, developing SC lines suited to specific regions would be time-consuming. For SC buckwheat to be commercially viable, this challenge will need to be addressed. Therefore, we suggest a breeding method that combines both SI and SC varieties (Fig. 3). This method would enable the effective and rapid production of high-value varieties ideally suited for commercial use in each particular region. The SC variety could be used to produce lines possessing valuable traits, such as non-allergenicity and low amylose content, by making recessive dysfunctional alleles homozygous. Once a valuable trait has been identified, it can be introduced into a leading SI variety through several rounds of backcrossing. We would expect the newly developed variety to have agricultural traits (e.g., yield) similar to those of the leading variety, because of their similar gene structures and genetic backgrounds. Eventually, a new SI elite variety harboring the valuable trait will be obtained (Fig. 3).
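
The expected genetic contribution of the leading (recurrent) variety at each backcross generation follows from the standard halving rule, and can be sketched as below. This is an illustration under simplifying assumptions: it gives the genome-wide average only, and it ignores selection at the target locus and linkage drag around it. The function name is hypothetical.

```python
from fractions import Fraction


def recurrent_parent_fraction(n_backcrosses):
    """Expected genome fraction from the recurrent parent in the BCnF1.

    n_backcrosses = 0 corresponds to the F1, which carries half of its
    genome from each parent. Each backcross halves the donor share.
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    return 1 - Fraction(1, 2) ** (n_backcrosses + 1)


for n in range(5):
    label = "F1" if n == 0 else f"BC{n}F1"
    share = recurrent_parent_fraction(n)
    print(f"{label}: expected recurrent-parent share = {share}")
```

With four backcrosses, the expected share of the leading variety is 31/32 (about 96.9%), which is why the resulting line is expected to resemble the leading variety in most agricultural traits.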

Fig. 3

Schematic diagram of a buckwheat breeding system using SI and SC plants. The values in parentheses indicate the percentage of genetic contribution from a parental line. S, allele controlling the thrum flower phenotype; s, allele controlling the pin flower phenotype; Sh, allele controlling the long-homostyle flower phenotype. Dominance relationship among the alleles: S > Sh > s. A, dominant allele of the wild type; a, dysfunctional valuable allele; *, pin plant from the SI elite variety; **, SC long-homostyle (LH) plant obtained from mutagenesis and/or crosses between SI landraces and SC experimental plants, such as “Norin-PL1”; ***, thrum plants from the SI elite variety. In the BC4F1 generation, many thrum plants should be used to avoid inbreeding depression in later generations. MAS, marker-assisted selection

While selecting new SI elite varieties, novel SC varieties may also be established. If we select for both a useful recessive (dysfunctional) gene and the Sh allele during the backcross procedure, we can obtain SC lines harboring a valuable trait. In this approach, marker-assisted selection (MAS) is an effective means of tracing the Sh allele. Molecular markers have already been established, as mentioned above (Matsui et al. 2020). After several rounds of selfing, if the SC lines do not show signs of inbreeding depression, they can be considered a new SC variety displaying stable high yield and a valuable trait.

Genomics-based breeding methods for improvement of buckwheat yields

Genomics-based breeding methods such as genome-wide association studies (GWAS) and genomic selection (GS) have been applied to many plant species (e.g., Korte and Farlow 2013; Huang and Han 2014). Yabe et al. (2018) reported that GS using 14,598–50,000 markers resulted in a 20.9% increase in selection index compared to the initial population over a 3-year period. Buckwheat production is low compared to that of major crops, and we think that the main reason is its low yield (0.97 tons/ha; FAOSTAT 2017). GS would be a suitable technique for improving buckwheat yields.
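
The selection step of GS can be sketched as follows. This is a generic illustration and not the procedure of Yabe et al. (2018). It assumes that marker effects have already been estimated from a training population with both genotypes and phenotypes, and that genotypes are coded as allele dosages (0, 1, or 2). All names and values are hypothetical.

```python
def gebv(genotype, marker_effects):
    """Genomic estimated breeding value: sum of allele dosage x marker effect."""
    if len(genotype) != len(marker_effects):
        raise ValueError("genotype and marker_effects must have the same length")
    return sum(dosage * effect for dosage, effect in zip(genotype, marker_effects))


def select_top(candidates, marker_effects, n_selected):
    """Return the names of the n_selected candidates with the highest GEBV."""
    ranked = sorted(
        candidates,
        key=lambda name: gebv(candidates[name], marker_effects),
        reverse=True,
    )
    return ranked[:n_selected]


# Hypothetical effects for four markers; a real analysis uses thousands.
marker_effects = [0.5, -0.25, 0.125, 0.0]

# Hypothetical allele dosages (0, 1, or 2) for each candidate plant.
candidates = {
    "plant_1": [2, 0, 1, 2],
    "plant_2": [0, 2, 0, 1],
    "plant_3": [1, 1, 2, 0],
    "plant_4": [2, 1, 2, 1],
}

print(select_top(candidates, marker_effects, 2))
```

Because candidates are ranked from genotype data alone, selection can take place before the trait is measured in the field, which is the main appeal of GS for a low-yielding crop with a short breeding history.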

The development of F1 hybrid varieties

One might think that introducing an SC system in buckwheat would prevent the establishment of an F1 hybrid breeding system, which has been attained in many crop species. However, such a breeding system would be possible if SI could be genetically manipulated so that it is controlled by factors such as thermo- or photoperiod sensitivity. Fully understanding the SI system of buckwheat will pave the way for an F1 hybrid breeding system in buckwheat.

Alternatively, if thermo- or photoperiod-sensitive genic or cytoplasmic male sterility, or a male sterile cytoplasm and its restorer gene, were identified, a line carrying them could be used to establish an F1 hybrid breeding system. Indeed, thermo- or photoperiod-sensitive genic or cytoplasmic male sterility has been used to obtain F1 plants in many crops such as wheat (e.g., Murai 1998), rice (e.g., Chang et al. 2016; Zhou et al. 2016), and maize (e.g., Zhang et al. 2018). However, there are no reports of such cytoplasmic or genic male sterility in buckwheat to date. Cytoplasmic male sterility often becomes apparent in the descendants of interspecific crosses or crosses with the wild subspecies (e.g., Ahokas 1979). Two wild species, F. homotropicum and F. esculentum ssp. ancestrale, which are distributed in the southwestern part of China, are cross-compatible with cultivated buckwheat (Ohsako et al. 2017). Interspecific crosses between buckwheat and these wild species and/or the use of genome editing (see next section) could yield such a cytoplasmic male sterile line.

Gene editing in buckwheat breeding

Genetically modified (GM) buckwheat would not be commercially viable in Japan. Buckwheat is viewed as a health-promoting food, and it is unlikely that there would be widespread consumer support for GM buckwheat, even if the plants are considered safe for human consumption. However, lines produced by mutagenesis can be used as non-GM plants in Japan. Gene editing techniques that can target and mutate a gene of interest have been developed in model and major plant species (Knott and Doudna 2018). As this technology can often be used to eliminate the function of the target gene, it could be a suitable strategy for buckwheat breeding efforts. To our knowledge, there is no report of successful gene editing in buckwheat. However, we predict that this technique will be developed in buckwheat in the near future. Because this technique can mutate several target genes simultaneously, it would be a highly effective means of producing non-allergenic buckwheat lines, as buckwheat contains several allergenic proteins (Park et al. 2000; Satoh et al. 2020). An SC line would be a powerful tool for developing this technique, as it would allow self-pollinated seeds to be obtained.

Other possibilities

Buckwheat also contains many beneficial chemical components in addition to flavonoids (Kawa et al. 2003; Wijngaard and Arendt 2006; Gimenez-Bastida and Zielinski 2015), such as fagopyritols, which are thought to have anti-diabetic effects when ingested (Steadman et al. 2000). To generate high-value lines, it will be important to identify the underlying genes using molecular techniques and available data, such as those housed in the BGDB.