5.1 Introduction

In each species, the nucleus of every cell contains a characteristic number of DNA molecules. Each DNA molecule is tightly coiled around proteins to form a chromosome. Individual chromosomes become visible under a microscope only during cell division. Cytological study can establish how genes, tandem repeats, retrotransposons, satellite DNA, and other chromosomal sequences are organized in relation to the frequency of recombination. In situ chromosome analysis of gene and other sequence locations provides information for many aspects of genetics and genome assembly and can thereby assist plant breeding.

Most Alliums are diploids (2n = 2x), with basic chromosome numbers x = 8 (Eurasia and Mediterranean basin), x = 7 (North America), or x = 9 (Eurasia) (Havey 2002). Polyploids such as triploids (A. rupestre, A. scorodoprasum), tetraploids (A. ampeloprasum, A. chinense, A. nutans), pentaploids (A. splendens), hexaploids (A. lineare), and octoploids (A. nutans) also occur (Jones and Rees 1968; Jones 1990; Konishi et al. 2011). Genome sizes differ 4.5-fold among Allium species, from 7 pg/1C in A. altyncolicum (Ricroch et al. 2005) to 31.49 pg/1C in A. ursinum (Ohri 1998). Bulb onion (Allium cepa, 2n = 2x = 16) has one of the largest genomes among crop plants: its 17 pg (16 Gb) haploid nuclear genome is more than 100-fold larger than that of Arabidopsis. This huge genome is packed into eight large chromosomes, which made onion a good model for classical cytogenetics. From the standpoint of modern molecular cytogenetics, however, onion is a difficult subject. The mean condensation of an onion mitotic metaphase chromosome has been estimated at 249.6 Mb/μm, assuming uniform condensation along the entire chromosome (Khrustaleva and Kik 2001). This is much greater than the condensation of human chromosomes at 26.6 Mb/μm (Alberts et al. 1989) or tomato chromosomes at 40.6 Mb/μm (Anderson et al. 1985). On average, one onion chromosome contains as much DNA as the entire diploid genome of tomato. Chromatin condensation in onion chromosomes remains a riddle in biology.
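
The comparisons in this paragraph are simple ratios and can be checked with a short script. The sketch below uses only the values quoted above; the conversion factor of 978 Mb per pg of DNA is a widely used approximation that is not stated in the text, and all function and variable names are illustrative.

```python
# Values quoted in the text (1C DNA content in pg, condensation in Mb/um).
PG_TO_MB = 978.0  # assumed conversion factor: 1 pg of DNA ~ 978 Mb

genome_pg = {"A. altyncolicum": 7.0, "A. cepa": 17.0, "A. ursinum": 31.49}
condensation = {"onion": 249.6, "human": 26.6, "tomato": 40.6}


def fold_difference(larger: float, smaller: float) -> float:
    """Return how many times `larger` exceeds `smaller`."""
    if smaller <= 0:
        raise ValueError("smaller value must be positive")
    return larger / smaller


def pg_to_gb(pg: float) -> float:
    """Convert a DNA amount in picograms to gigabases."""
    return pg * PG_TO_MB / 1000.0


size_range = fold_difference(genome_pg["A. ursinum"], genome_pg["A. altyncolicum"])
onion_gb = pg_to_gb(genome_pg["A. cepa"])
mean_chromosome_gb = onion_gb / 8  # eight chromosomes in the haploid set
onion_vs_human = fold_difference(condensation["onion"], condensation["human"])
onion_vs_tomato = fold_difference(condensation["onion"], condensation["tomato"])

print(f"genome size range among Allium species: {size_range:.1f}-fold")
print(f"A. cepa haploid genome: {onion_gb:.1f} Gb")
print(f"mean A. cepa chromosome: {mean_chromosome_gb:.1f} Gb")
print(f"condensation, onion vs human: {onion_vs_human:.1f}-fold")
print(f"condensation, onion vs tomato: {onion_vs_tomato:.1f}-fold")
```

Under these assumptions an average onion chromosome holds about 2 Gb of DNA, and onion metaphase chromosomes are roughly nine times more condensed than human ones and six times more condensed than tomato ones.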

5.2 Allium Chromosome Nomenclature and Homoeology Relationships

Participants in a workshop on onion chromosome nomenclature, held at the University of Warwick, Coventry, U.K., on 8 September 1988 during the Eucarpia 4th Allium Symposium, agreed to adopt Kalkman’s nomenclature (de Vries 1990). Since then, the system proposed by Kalkman (1984) has been the standard procedure for identifying A. cepa chromosomes. Cytogeneticists who studied the karyotype of A. cepa in several long-day (LD), short-day (SD), and multiplier genotypes, including shallots, obtained results largely in agreement with those of Kalkman (Peffley and Currah 1988; de Putter and van de Vooren 1988; de Vries and Jongerius 1988).

For six Allium species, de Vries and Jongerius (1988) produced karyotype idiograms based on morphometry and C-banding, with the chromosome nomenclature following Kalkman’s standard system (Table 5.1). For these species, chromosomes with the same rank order number are not necessarily homoeologues. The nucleolar organizing chromosomes 6 of A. cepa and A. fistulosum are known to be homoeologous because these satellite chromosomes form bivalents that clearly share one nucleolus in pachytene and early diplotene of an interspecific hybrid between A. cepa and A. fistulosum (Jones and Rees 1968). Allium cepa and A. roylei possess two 5S rDNA loci, both localized on the short arm of the smallest metacentric chromosome 7 (Shibata and Hizume 2002; Fredotovic et al. 2014). Allium fistulosum possesses one 5S locus on the short arm of the smallest metacentric chromosome 7 (Son et al. 2012; Kirov et al. 2017). Thus, the smallest metacentric chromosomes 7 of A. cepa, A. roylei, and A. fistulosum are homoeologues. The homoeology relationships between the remaining six chromosomes remain to be established in further cytogenetic studies of the genus Allium.

Table 5.1 Relative chromosome length and centromeric index of the six Allium species (produced from the data published by de Vries and Jongerius 1988)

5.3 Genes, Genetic Markers, and Linkage Group Assignment to the Chromosomes Via Alien Monosomic Addition Lines

Alien monosomic addition lines (AMALs) contain a single chromosome of an alien donor species in addition to the entire chromosome complement of the recipient species. Genetic analysis of AMALs is an effective approach, well established in wheat, for allocating genes and genetic markers to physical chromosomes.

A unique resource, a complete set of monosomic alien addition lines representing the eight different chromosomes of shallot (A. cepa L. Aggregatum group) in an A. fistulosum background (2n = 17, 16FF +C1-8), has been developed (Shigyo et al. 1996, 1998). Chromosome-specific genetic markers (ten isozymes, 5S rDNA, and 16 RAPD markers) (Shigyo et al. 1994, 1995a, b, 1996, 1997), genes related to flavonoid and anthocyanin production (Shigyo et al. 1997), genes involved in flavonoid biosynthesis (partial sequences of the candidate genes CHS-A, CHS-B, CHI, F3H, DFR, and ANS), and the 3GT gene for glycosylation of anthocyanidin (Masuzaki et al. 2006) have been assigned to the individual shallot chromosomes (Table 5.2).

Table 5.2 Chromosome-specific genetic markers of Allium cepa L. determined via AMALs

Using the AMALs, the A. cepa linkage groups built from AFLP markers in an interspecific cross between A. cepa and A. roylei were assigned to the individual physical chromosomes of A. cepa (van Heusden et al. 2000). The onion and shallot accessions used in this study were genetically very similar: more than 90% of their AFLP fragments were of equal size in both accessions. Fifty-one AFLP markers of A. cepa that were absent in A. roylei were distributed over all eight linkage groups. Because the chromosomal origin of these 51 markers had been determined with the set of AMALs, they established the correspondence between the linkage groups and the physical chromosomes. The complete set of AMALs also allowed the detection of 186 A. cepa chromosome-specific AFLP markers. A further 74 codominant onion EST-derived markers were evaluated in the A. cepa x A. roylei interspecific population, enabling the merging of the AFLP-based maps (McCallum et al. 2012).
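
The assignment logic described above, in which markers of known chromosomal origin indicate the chromosome of the linkage group they map to, can be sketched as follows. The marker names, linkage-group labels, and the majority-vote rule are hypothetical illustrations, not the data or the exact procedure of van Heusden et al. (2000).

```python
from collections import Counter, defaultdict

# Hypothetical input: marker -> chromosome (from AMAL screening) and
# marker -> linkage group (from the genetic map). All names are invented.
marker_chromosome = {"m01": 1, "m02": 1, "m03": 5, "m04": 5, "m05": 5, "m06": 2}
marker_linkage_group = {"m01": "LG_A", "m02": "LG_A", "m03": "LG_B",
                        "m04": "LG_B", "m05": "LG_B", "m06": "LG_A"}


def assign_linkage_groups(chrom_of, group_of):
    """Assign each linkage group to the chromosome most of its anchored markers come from.

    Returns {linkage_group: (chromosome, supporting_markers, total_anchored_markers)}.
    """
    votes = defaultdict(Counter)
    for marker, group in group_of.items():
        if marker in chrom_of:  # only markers with a known chromosome can vote
            votes[group][chrom_of[marker]] += 1
    assignment = {}
    for group, counter in votes.items():
        chromosome, support = counter.most_common(1)[0]
        assignment[group] = (chromosome, support, sum(counter.values()))
    return assignment


assignment = assign_linkage_groups(marker_chromosome, marker_linkage_group)
for group, (chromosome, support, total) in sorted(assignment.items()):
    print(f"{group}: chromosome {chromosome} ({support}/{total} anchored markers agree)")
```

A linkage group whose support is lower than its total, such as the invented LG_A here, contains a conflicting marker that would warrant rechecking before the assignment is accepted.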

The AMALs were also used to assign onion linkage groups constructed from an intraspecific segregating population to physical chromosomes (Martin et al. 2005). King et al. (1998) developed a low-density intraspecific onion genetic map of 116 markers, based primarily on restriction fragment length polymorphisms (RFLPs), from the BYG15-23 × AC43 segregating family. Later, 100 new expressed sequence tag (EST)-derived markers were added to this map to produce the most detailed intraspecific map to date, encompassing 1907 cM, and the linkage groups were assigned to onion chromosomes (Martin et al. 2005).

Thus, the AMALs developed by Shigyo and his colleagues are the key resource that has enabled the alignment of Allium genetic maps to physical chromosomes and facilitated comparative studies among species (McCallum et al. 2012). The AMALs were also used to anchor SSR-based maps of A. fistulosum to physical chromosomes (Tsukazaki et al. 2008). Using the AMALs and allotriploid-bunching onion single alien deletions, 513 EST-derived markers were assigned to the eight chromosomes of A. fistulosum and A. cepa (Tsukazaki et al. 2010). Data on markers assigned to the Allium physical map with AMALs in multiple studies have been compiled, and they reveal extensive synteny between A. cepa and A. fistulosum (McCallum et al. 2012).

AMALs of A. cepa carrying extra chromosomes from A. roylei (RR, 2n = 2x = 16) were constructed, and all eight possible types of A. cepa-A. roylei monosomic addition lines (CC+1R–CC+8R) were identified by analyses of isozymes, EST markers, and karyotypes (Hoa et al. 2012). The availability of these AMALs to scientists and breeders will extend our knowledge of the A. roylei genome and genetics and will improve the introgression of desirable genes from A. roylei into A. cepa.

5.4 In Situ Direct Mapping Genes/Markers on Allium Chromosomes

How can genes, markers, and other DNA sequences be visualized on physical chromosomes? Fluorescence in situ hybridization (FISH) allows researchers to visualize and map a DNA sequence at a specific position on an individual chromosome. FISH combines the chromosome preparation of classical cytogenetics with recombinant DNA technology. The hybridization site of the labeled DNA probe on chromosomal DNA is visualized using a fluorescent reporter molecule bound directly or indirectly to the probe.

5.4.1 rRNA Encoding Genes Organized in Tandem Arrays

The ribosomal RNA (rRNA) genes are arranged in hundreds or thousands of tandem repeats and are highly conserved between species. These chromosome regions are the most widely investigated by FISH, and the positions of rDNA sites are known in hundreds of plant species. In higher eukaryotes, rRNA genes are organized as two distinct multigene families composed of tandemly arrayed repeats. One family is represented by the 45S rDNA, which consists of a transcriptional unit that codes for the 18S, 5.8S, and 28S rRNAs, and an intergenic spacer (IGS). Multiple copies of 45S rDNA correspond to the nucleolar organizer regions (NORs). The other family codes for the 5S rRNA and consists of a highly conserved coding sequence of 120 base pairs (bp), with adjacent coding units separated by an IGS. The 5S rDNA is not normally associated with NORs (reviewed in Long and Dawid 1980). The 45S rDNA sequences were shown to lie on the NOR of the satellite chromosomes and the 5S rDNA to be located, generally, on one of the other chromosomes (Gerlach and Dyer 1980; Leitch and Heslop-Harrison 1992). Thanks to FISH technology, it was found that the 45S rRNA genes are located not only at secondary constrictions (NORs) but also at other sites where they had not previously been detected (Guerra et al. 1996). Secondary constrictions mark only those rRNA gene sites that were active during the last interphase; other functional sites may not form secondary constrictions, especially if they are located too close to the chromosome ends (Vanzela et al. 2003; Roa and Guerra 2012).

In A. cepa, the 45S rDNA was located on the short arm of NOR-bearing chromosome 6 and in the distal region of the short arm of the smallest chromosome 8 (Ricroch et al. 1992; Do et al. 2001; Mancia et al. 2015) (Fig. 5.1a). In A. fistulosum, FISH probing with 18S rDNA (Hizume 1994) and 45S rDNA (Kudryavtseva and Khrustaleva 2018 unpublished data) (Fig. 5.1b) revealed a hybridization signal only on the short arm of NOR-bearing chromosome 6. In A. wakegi (2n = 2x = 16), a natural allodiploid hybrid between A. cepa and A. fistulosum, probing with 18S rDNA (Hizume 1994) and 45S rDNA (Kirov and Khrustaleva 2018 unpublished data) (Fig. 5.1c) revealed fluorescent signals on three chromosomes: at the secondary constrictions of chromosomes 6 of A. cepa and A. fistulosum and on the short arm of chromosome 8 of A. cepa. The parental origin of the chromosomes was confirmed by genomic in situ hybridization (GISH) (Hizume 1994). The 5S rDNA genes have been mapped to two loci (proximal and interstitial) on the short arm of chromosome 7 of A. cepa (Hizume 1994; Lee and Seo 1997) (Fig. 5.1d). Shibata and Hizume (2002) analyzed the structure and chromosomal location of the different 5S rDNA units of A. cepa using microdissection and FISH. The authors dissected the proximal and distal 5S rDNA segments separately and used them as templates for PCR. They showed that the long 5S rDNA unit was present only distally, whereas the short unit was located predominantly proximally on the short arm of chromosome 7. In A. fistulosum, the 5S rDNA genes have been mapped to one locus on the short arm of chromosome 7 (Hizume 1994; Do and Seo 2000; Kirov et al. 2017) (Fig. 5.1e). A FISH study of 15 Allium species showed that the 5S rDNA was located primarily on chromosomes 5 and/or 7 in diploid species and on various chromosomes in alloploid species (Do and Seo 2000).

Fig. 5.1

Application of FISH in Allium genome research. a Two-color FISH probing with 45S rDNA (green) and 5S rDNA (red) in A. cepa. Photo by I. Kirov. b FISH probing with 45S rDNA (green) in A. fistulosum. Photo by N. Kudryavtseva. c FISH probing with 45S rDNA (red) in A. x wakegi. Photo by I. Kirov. d FISH probing with 5S rDNA (green) in A. cepa. Confocal microscopy photo by D. Romanov. e Two-color FISH probing with 5S rDNA (green) and HAT58 (red); the arrows indicate polymorphic sites of HAT58 on the long arm of chromosome 7. Photo by I. Kirov. f BAC-FISH probing with the clone carrying the LFS gene insert. The C0t-100 fraction was used as a block. Photo by L. Khrustaleva. g Tyr-FISH probing with a cocktail of SNP markers tightly linked to the Ms locus. Photo by L. Khrustaleva. h FISH probing with HAT58; the arrows indicate polymorphic sites of HAT58 on the long arm of chromosome 7. Photo by I. Kirov. i Multicolor GISH in the first generation of the bridge cross (A. cepa x (A. roylei x A. fistulosum)): A. roylei (green), A. fistulosum (red), and A. cepa (block; DAPI, blue). Photo by L. Khrustaleva

5.4.2 Unique Genes Mapping on Highly Condensed Allium Chromosomes

5.4.2.1 BAC-FISH Mapping

While rRNA genes organized in tandemly arrayed repeats are easily localized by FISH, unique genes have not been readily mapped on physical chromosomes because of the technical challenge of visualizing a small DNA target. In plant species with small gene-rich genomes, such as Arabidopsis thaliana (Koornneef et al. 2003) or rice (Jiang et al. 1995), specific loci can be detected by FISH with large genomic bacterial artificial chromosome (BAC) clones as probes. Unfortunately, the onion genome contains families of abundant repetitive elements (Stack and Comings 1979; Pearce et al. 1996), so any clone that includes a copy of a repetitive DNA will hybridize across the genome, making it unsuitable as a FISH probe. BAC clones carrying the lachrymatory factor synthase (LFS) gene were the first to be mapped on A. cepa chromosomes once these technical difficulties were overcome (Masamura et al. 2012) (Fig. 5.1f). The C0t-100 fraction was used to block the repetitive sequences in the target DNA. Two fully sequenced BAC clones, 2E8/10 and 4F10/155, were hybridized to mitotic metaphase chromosomes. Sequence comparison of the two BAC clones bearing LFS genes, LFS amplicons from diverse germplasm, and expressed sequences from a doubled haploid line revealed variation consistent with duplicated LFS genes. The BAC-FISH study showed that these BAC clones co-localize in the proximal region of the long arm of chromosome 5. However, the clones may lie up to 25 Mbp apart and still appear at the same position because of the high compaction of mitotic metaphase chromosomes and the resolution limit of a conventional light microscope. The results suggested that LFS in A. cepa is transcribed from at least two loci and that both are localized on chromosome 5. Genetic mapping of polymorphisms detected by heteroduplex analysis of LFS amplicons in the A. cepa x A. roylei interspecific cross revealed co-segregation with markers linked to chromosome 5 (Masamura et al. 2012). The position of LFS on the genetic map was thus linked with its position on the physical chromosome (Fig. 5.2b).

Fig. 5.2

Integration of gene positions on the physical chromosome into the recombination map. a Ms locus visualized on the long arm of chromosome 2 of A. cepa using Tyr-FISH. The chromosome was extracted from the metaphase in Fig. 5.1g. b LFS locus visualized on the long arm of chromosome 5 of A. cepa using BAC-FISH. The chromosome was extracted and straightened from the metaphase in Fig. 5.1f

5.4.2.2 ESTs Mapping

ESTs are attractive candidates for chromosomal gene mapping because they contain protein-coding sequences and often lack the dispersed repetitive DNA sequences that may complicate FISH signals. Such probes therefore do not require blocking of dispersed repeats by C0t fractions. Unfortunately, detecting such small unique sequences on plant chromosomes has been difficult because the length of target chromosomal DNA that can be routinely visualized by FISH is 10 kb (Jiang et al. 1995; Jiang and Gill 2006), which is longer than the average gene length of 2.5–4.0 kb. To overcome this sensitivity limit, Raap et al. (1995) introduced fluorescent tyramide conjugates as substrates for horseradish peroxidase (HRP) into FISH technology. The technique combines the advantage of an enzymatic procedure, in which signal is amplified by the deposition of many substrate molecules, with that of fluorescence-based detection, which is more sensitive than the absorbance-based detection used in enzymatic procedures. With this method, detection sensitivity can be increased up to 100 times compared with the conventional FISH procedure. Khrustaleva and Kik (2001) adapted this ultrasensitive FISH method, termed tyramide-FISH (Tyr-FISH), for plant cytogenetics. The authors were able to visualize the position of T-DNA inserts as small as 710 bp in transgenic shallots. Another problem faced by researchers is the plant cell wall, which hampers access of the hybridization probe to the target DNA sequence. The chromosome preparation procedure has a very strong impact on chromatin accessibility and on the detection of short probes. Recently, a novel method named “SteamDrop” was developed for the preparation of high-quality, well-spread mitotic and pachytene chromosomes of plants (Kirov et al. 2014, video of “SteamDrop” method available at www.plantgen.com). 
Sequence information for expressed sequence tags (ESTs) and for fully or partially sequenced genes is publicly available from GenBank at the NCBI (http://www.ncbi.nlm.nih.gov/genbank/) and can be used to design primers or oligonucleotides for producing DNA probes for in situ mapping. Using Tyr-FISH, the EST clones API15 (GenBank Accession Number: BE205550.1), API59 (BE205590.1), API23 (BE205556.1), API92 (BE205605.1), and API66 (BE205593.1) were visualized on chromosome 5 of A. cepa, which carries several QTLs and desirable genes (Romanov et al. 2015). These EST clones had previously been mapped to the linkage group assigned to chromosome 5 (King et al. 1998; Martin et al. 2005). The positions of these markers on the genetic map were integrated with their physical positions on the chromosome. Through this integration of genetic and cytogenetic maps, the distribution of recombination events along onion chromosome 5 was estimated. The lowest base pair/centimorgan estimate (0.8 Mb/cM), which corresponds to the highest recombination frequency, was found between the markers API59-H3-15.0/9.5, API92-E1-11.0/12.0, and API66-E5-6.7/9.5 located in the interstitial region. The estimate was over 20 times higher (21.8 Mb/cM), indicating much lower recombination, for the API23-H3-12.0/6.5 and API15-E1-3.0 markers located in the proximal region (Romanov et al. 2015).
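
A base pair/centimorgan estimate of this kind is the physical distance between two markers divided by their genetic distance. A minimal sketch of the calculation follows; the function name and the example interval (43.6 Mb over 2.0 cM) are hypothetical, chosen only to give a value of the same magnitude as that reported for the proximal region, and are not taken from Romanov et al. (2015).

```python
def mb_per_cm(physical_mb: float, genetic_cm: float) -> float:
    """Physical distance per unit of recombination (Mb/cM) for one marker interval."""
    if genetic_cm <= 0:
        raise ValueError("genetic distance must be positive")
    return physical_mb / genetic_cm


# Hypothetical interval: 43.6 Mb between two FISH signals, 2.0 cM on the genetic map.
example = mb_per_cm(43.6, 2.0)

# Estimates quoted in the text for onion chromosome 5 (Mb/cM).
interstitial = 0.8
proximal = 21.8
ratio = proximal / interstitial

print(f"example interval: {example:.1f} Mb/cM")
print(f"proximal vs interstitial region: {ratio:.1f} times more DNA per cM")
```

With the two quoted estimates, one centimorgan in the proximal region corresponds to roughly 27 times more DNA than one centimorgan in the interstitial region.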

5.4.2.3 Markers Tightly Linked to the Male Fertility Restoration Locus (Ms) of Onion

The importance of knowing the position of a gene on the physical chromosome was clearly demonstrated by the Tyr-FISH mapping of markers linked to the Male Fertility Restoration Locus (Ms) in A. cepa (Khrustaleva et al. 2016). Hybrid onion seed is commonly produced using cytoplasmic male sterility (CMS). Seed propagation of male-sterile plants (S msms) is possible by crossing with maintainer (N msms) plants (Jones and Davis 1944), and selection of superior maintainer lines is a primary focus of hybrid onion breeding programs. To determine genotypes at Ms more quickly, several research groups have identified molecular markers from the genome or transcriptome showing linkage to Ms (Gökçe et al. 2002; Gökçe and Havey 2002; Huo et al. 2012; Yang et al. 2013; Bang et al. 2013; Havey 2013; Park et al. 2013; Kim 2014). Many markers linked to Ms have been identified even though in some cases relatively few clones or primers were screened. Genetically close markers may actually be far apart in terms of base pairs (or vice versa) because the frequency of recombination differs along the length of a chromosome. Thus, in chromosome regions with relatively low recombination, markers showing tight genetic linkage may be physically quite distant from each other. Tyr-FISH probing with the markers linked to Ms (SNPs and RFLP) revealed that these markers lie close to the centromere on the long arm of chromosome 2 (Khrustaleva et al. 2016) (Fig. 5.1g), a region of lower recombination (Albini and Jones 1988). The position of Ms on the genetic map was thus linked with its position on the physical chromosome (Fig. 5.2a). Four markers were co-localized at a relative position from the centromere of −0.1 ± 0.02 on highly condensed mitotic metaphase chromosomes. On super-stretched pachytene chromosomes, the four markers were visualized as a linear string of fluorescent signals measuring 7.4 ± 0.6 µm. 
If super-stretched pachytene chromosomes are assumed to be 20 times longer than regular pachytene chromosomes and to correspond to 1.5 Mb/µm (Koo and Jiang 2009), the markers would be spread across a region of about 10 Mb. This does not diminish the usefulness of these molecular markers for predicting genotypes at Ms; however, it does indicate that eventual map-based cloning of Ms may be arduous. As the cost of DNA sequencing continues to decline, the nuclear genome of onion will eventually be sequenced and assembled. Nevertheless, identifying candidates for Ms may remain difficult because flanking markers may not fall on a single contig.
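
The size estimate above is the product of the measured signal length and the assumed DNA density of super-stretched pachytene chromosomes. The sketch below redoes this arithmetic with the values from the text and scales the measurement error linearly; treating the density as exact is a simplifying assumption, and the function name is illustrative.

```python
def physical_span_mb(length_um: float, sd_um: float, density_mb_per_um: float):
    """Convert a measured chromosome length (um) into a physical span (Mb).

    The density is treated as exact, so the standard deviation scales linearly.
    """
    return length_um * density_mb_per_um, sd_um * density_mb_per_um


# 7.4 +/- 0.6 um of signal at an assumed 1.5 Mb/um (values from the text).
span, span_sd = physical_span_mb(7.4, 0.6, 1.5)
print(f"markers span about {span:.1f} +/- {span_sd:.1f} Mb")
```

The result, about 11 Mb, agrees with the rounded 10-Mb figure given in the text.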

5.5 Chromosomal Organization of Repetitive DNA Sequences

5.5.1 Tandem Repeats

A tandem repeat in DNA consists of two or more adjacent, approximate copies of a pattern of nucleotides. Tandem repeats are associated with important chromosomal landmarks such as centromeres, telomeres, subtelomeric regions, and other heterochromatic regions and have been widely studied during the last decades (Henikoff et al. 2001; Jiang et al. 2003; Koo et al. 2011). The first publication on the chromosomal distribution of tandem repeats in Alliums was that of Barnes et al. (1985). A 375-bp fragment was isolated from a BamHI digest of A. cepa genomic DNA, cloned, and sequenced. The tritiated plasmid DNA was hybridized to metaphase chromosomes and detected on autoradiographs. The 375-bp tandem repeats were located at the telomeric ends of all chromosomes except on the short arm of the NOR-bearing chromosomes. The 375-bp tandem repeats constitute about 4% of the A. cepa genome. Later, Irifune et al. (1995) isolated a 380-bp DNA sequence from EcoRV digests of the total genomic DNA of A. fistulosum and showed by FISH that this tandem repeat had the same chromosomal localization as the 375-bp repeat in A. cepa. Moreover, the 380-bp sequence of A. fistulosum had 82% homology with the 375-bp repeat of A. cepa. The copy number of the 380-bp tandem repeat was estimated at about 2.8 × 10⁶ per haploid genome of A. fistulosum (Irifune et al. 1995). Using FISH and PCR analysis of the A. fistulosum genome, Fesenko et al. (2002) showed that the 380-bp repeat arrays contain inversions and are interspersed with microsatellite and Ty1-copia retrotransposon sequences. A 314-bp tandemly repeated DNA sequence, named pAc074, with high homology to the 375-bp repeat was isolated by PCR with a set of random primers, and FISH located it at the telomeric ends of the A. cepa chromosomes (Do et al. 2001). FISH with the 375-bp repeat probe derived from A. cepa on the chromosomes of 27 species (37 accessions) belonging to 14 sections of four subgenera of Allium showed that the closely related species analyzed possessed a very similar satellite sequence at the telomeric ends of their chromosomes (Pich et al. 1996a). Taken together, these results suggest that this repeat had already evolved in progenitor forms of Alliums and remained unusually well conserved during speciation.
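
The genome share of a tandem repeat and its copy number are linked through the repeat unit length and the genome size. As a rough illustration, the sketch below converts the reported 4% genome share of the 375-bp repeat into an approximate copy number, using the 16 Gb haploid genome size of A. cepa quoted in Section 5.1, and converts the reported copy number of the 380-bp repeat into a total amount of DNA. The results are back-of-the-envelope estimates, not published values, and the function names are illustrative.

```python
def copies_from_fraction(genome_bp: float, fraction: float, unit_bp: int) -> float:
    """Approximate copy number of a repeat occupying `fraction` of the genome."""
    return genome_bp * fraction / unit_bp


def array_size_bp(copies: float, unit_bp: int) -> float:
    """Total DNA (bp) contributed by `copies` repeat units of length `unit_bp`."""
    return copies * unit_bp


# A. cepa: 375-bp repeat, about 4% of a 16 Gb haploid genome (values from the text).
cepa_copies = copies_from_fraction(16e9, 0.04, 375)

# A. fistulosum: 380-bp repeat, about 2.8 million copies per haploid genome.
fistulosum_bp = array_size_bp(2.8e6, 380)

print(f"A. cepa 375-bp repeat: ~{cepa_copies / 1e6:.1f} million copies")
print(f"A. fistulosum 380-bp repeat: ~{fistulosum_bp / 1e9:.2f} Gb in total")
```

Under these assumptions the A. cepa repeat would be present in the order of 1.7 million copies, the same order of magnitude as the copy number reported for the related repeat in A. fistulosum.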

FISH analysis of 91 randomly selected BAC clones containing about 100 kb inserts of A. cepa genomic DNA identified nine clones showing FISH signals at centromeric and proximal regions and three clones showing telomeric signals (Suzuki et al. 2001). Because Cot-1 DNA was not used as a competitor in these BAC-FISH experiments, the BAC clones with distinct localized signals might contain large arrays of tandem repeats.

Tandem repeats are a valuable source of cytogenetic markers for distinguishing individual chromosomes (Albert et al. 2010). Repeatome analysis of the A. fistulosum genome identified novel tandem repeats that can be used as cytogenetic markers (Kirov et al. 2017). Using next-generation sequencing data, the authors identified two novel tandem repeats, HAT58 and CAT36, which together occupy 0.25% of the A. fistulosum genome, with 160,000 copies of HAT58 and 93,000 copies of CAT36 per haploid genome. FISH analysis showed that CAT36 is located in the pericentromeric regions of chromosomes 5 and 6 of A. fistulosum. HAT58 occupies intercalary heterochromatin of chromosomes 6, 7, and 8 associated with C-banding patterns (Fig. 5.1h). Moreover, FISH with HAT58 revealed that this tandem repeat is polymorphic: plants with three types of location pattern were observed, namely signals on both homologs of chromosome 7, signals on only one homolog, and no signals on either homolog. Thus, HAT58 might spread quickly to new genomic regions, resulting in polymorphic sites. This finding suggests that the rapid evolution of the HAT58 repeat is still ongoing. HAT58 and CAT36 are species-specific tandem repeats, as shown by FISH on the two closely related species A. cepa and A. fistulosum and on A. wakegi (2n = 2x = 16), a natural allodiploid hybrid between A. fistulosum and A. cepa (Kirov et al. 2017).

5.5.2 Retrotransposons

Retrotransposons are the commonest class of eukaryotic transposable elements. They differ from other transposons in their ability to transpose via an RNA intermediate, which is converted into extrachromosomal DNA by reverse transcription before reinsertion. Ty1-copia group retrotransposons are present throughout the plant kingdom as highly heterogeneous populations of high copy number elements (Kumar 1996), and there are 100,000–200,000 copies within the A. cepa diploid genome (Pearce et al. 1996). FISH to metaphase chromosomes reveals that Ty1-copia retrotransposons are distributed throughout the euchromatin of all chromosomes of A. cepa but are enriched in the terminal heterochromatic regions, which contain tandem arrays of satellite sequences (Pearce et al. 1996; Pich and Schubert 1998).

5.6 Centromere

The centromere is essential for the proper segregation and inheritance of genetic information. Visually, the centromere appears on metaphase chromosomes as a primary constriction. The DNA sequence underlying the centromere is not evolutionarily conserved and, in most species, is composed of megabases of rapidly evolving tandem repeats (Melters et al. 2013). In all flowering plants investigated so far, the centromere is generally composed of large arrays of centromeric satellite repeats and centromeric retrotransposons. The abundance and arrangement of these repeats vary substantially, both within and among species (Jiang et al. 2003; Nagaki et al. 2004, 2009; Nagaki and Murata 2005). The satellite sequences that occur within the centromeres of most eukaryotes are usually species-specific (Houben and Schubert 2003). Among Allium species, centromeric DNA sequences have been identified only for A. fistulosum, by chromatin immunoprecipitation (ChIP) and TAIL-PCR (Nagaki et al. 2012). Three clones with inserts of ChIP-isolated centromeric sequences produced FISH signals at the centromere position of all 16 chromosomes of A. fistulosum (Fig. 5.1). These clones were sequenced and their sequences are available in GenBank: Afi11 (AB735740), Afi19 (AB735741), and Afi56 (AB735743). The authors found two more clones that produced centromeric FISH signals, but not on all 16 chromosomes. For Afi54 (AB735742), 12 strong and four weak signals were observed on the centromeres, and for Afi61 (AB735744), one strong signal pair and 14 weak signals. The DNA sequences of these five clones did not show similarity to annotated genomic sequences in NCBI databases.

The centromeric and pericentromeric regions of plant chromosomes are colonized by Ty3/gypsy retrotransposons. Centromeric retrotransposons belong to a number of lineages of the chromovirus family of Ty3/gypsy LTR (long terminal repeat) retrotransposons (Neumann et al. 2011). Centromeric retrotransposons are found between the centromeric satellite repeats (Cheng et al. 2002) and can also play the major role in centromere structure (Li et al. 2013). The presence of Ty3/gypsy-like retrotransposons in the centromeric region of Allium cepa and Allium fistulosum was reported by Kiseleva et al. (2014). The putative copy number of Ty3/gypsy centromeric retrotransposons was estimated at about 26,000 for A. cepa and about 7000 for A. fistulosum. Centromeric retrotransposons were also identified in silico and then clustered. Using NCBI Entrez, a total of 10,725 genome survey sequences (GSSs) of A. cepa were identified. These sequences were used as query data for the RepeatExplorer server. FISH with PCR products obtained with primers designed on the reverse transcriptase of Beetle1 and CRM Ty3/gypsy elements showed strong hybridization signals in the centromeric regions of A. cepa. FISH on the chromosomes of A. fistulosum showed hybridization signals of different intensity in the centromeric region as well as in other chromosomal regions. Estimates of retrotransposon insertion times in the onion genomes suggest that some LTR retrotransposons were highly active in their recent history (Vitte et al. 2013). It is still unclear why some retrotransposons insert selectively into the centromere region. A possible explanation is that the centromeric region provides a safe environment for retrotransposons, reducing the chance of their elimination from the genome via recombination, which is suppressed in this region.

5.7 Telomere

The ends of eukaryotic chromosomes are capped by a special structure called the telomere. The telomere protects the termini of eukaryotic chromosomes from degradation by nucleases, illegitimate fusion, and progressive shortening as a result of incomplete replication of linear DNA molecules at their 5′-ends. The DNA component of the telomere is typically formed by long arrays of a short G-rich tandemly repeated minisatellite sequence that, depending on the organism, extend from tens of base pairs to as much as 150 kilobase pairs. In contrast to the centromere, the telomeric DNA sequence is highly conserved within large groups of organisms, e.g., TTAGGG in vertebrates (Cheng et al. 1989; Meyne et al. 1989), TTTAGGG in plants (Richards and Ausubel 1988), and TTAGG in insects (Okazaki et al. 1993). The strong conservation of telomeric repeats is likely a result of the interaction between telomeric DNA and telomere-specific binding proteins (Watson and Riha 2010). Surprisingly, telomeres composed of the (TTTAGGG)n repeats typical of most plants are absent in the Alliaceae family. For decades, scientists tried to find out how Alliaceae stabilize their chromosome ends in the absence of TTTAGGG sequences. Suggested candidates for alternative telomeric sequences included ribosomal RNA genes and subtelomeric satellite sequences, which could be spread by homologous recombination (Pich et al. 1996b). A possible involvement of Ty1-copia retrotransposons was also suggested for Allium (Kumar et al. 1997), but a subsequent study did not support this idea (Pich and Schubert 1998). To clarify the enigma of Allium telomere maintenance, extensive studies of the evolutionary variability of telomeres have been performed in Asparagales plants. 
In a number of families in this order, starting from the divergence of the Iridaceae family, the typical plant-type telomeric sequence (TTTAGGG)n had been partly or fully replaced by the vertebrate-type sequence (TTAGGG)n synthesized by telomerase (Sykorova et al. 2003, 2006a, b). However, the genus Allium was an exception where neither of the plant-type repeats or their known variants could be detected at chromosome termini, and a corresponding telomerase activity was absented as well (Fajkus et al. 2005; Sykorova et al. 2006a). Finally, in 2015 the group of professor Jiří Fajkus reported that Allium telomeres are unmasked. The unusual telomeric sequence (CTCGGTTATGGG)n is synthesized by telomerase (Fajkus et al. 2016). Due to recent advances in bioinformatics on transcriptomic and genomic data in combination with conventional approaches of molecular biology and molecular cytogenetics, the researchers succeeded in finding the unusual telomeric sequence of Allium and demonstrate its synthesis by telomerase.
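The telomere types discussed above differ only in their repeat unit, so a candidate terminal sequence can be screened for tandem arrays of each unit. The following is a minimal sketch, not the method used in the cited studies: the motif labels, the function names, and the threshold of three consecutive copies are illustrative assumptions, and only the strand as written is scanned (the C-rich complementary strand is ignored).

```python
import re

# Repeat units quoted in the text, written on the G-rich strand.
# The dictionary keys are illustrative labels chosen for this sketch.
TELOMERE_MOTIFS = {
    "plant-type": "TTTAGGG",
    "vertebrate-type": "TTAGGG",
    "insect-type": "TTAGG",
    "Allium-type": "CTCGGTTATGGG",
}


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


def classify_terminus(sequence: str, min_copies: int = 3):
    """Return the label of the motif with the longest tandem array, or None.

    `min_copies` is an arbitrary cut-off (assumption) that avoids calling a
    telomere type from one or two chance occurrences of a short motif.
    """
    hits = {name: longest_tandem_run(sequence, motif)
            for name, motif in TELOMERE_MOTIFS.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] >= min_copies else None


if __name__ == "__main__":
    synthetic_end = "ACGT" + "CTCGGTTATGGG" * 5 + "AC"  # made-up test sequence
    print(classify_terminus(synthetic_end))
```

Because a shorter unit such as TTAGGG occurs inside every copy of TTTAGGG, the sketch counts consecutive copies rather than single matches; a plant-type array therefore scores only one copy for the vertebrate-type motif.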

5.8 Integration of Recombination, Cytogenetic and Sequence (Contigs) Maps

Since King et al. (1998) published the first genetic (recombination) map of an Allium species, AFLP, SSR, and EST-derived SNP markers have been employed to increase its marker density (McCallum et al. 2012). The denser genetic maps have considerably improved the efficiency of the breeding process and expanded our knowledge of onion genetics. Genetic maps display the linear order of genes or markers and the recombination frequencies between them. The distance between markers is expressed in centimorgans (cM), where 1 cM corresponds to a recombination frequency of 1%. Although genetic maps are important in biology, they do not show the real physical distance between genes or markers, because recombination is unequally distributed along the chromosomes. One cM on a genetic map can be equivalent to a few kilobases or to millions of base pairs of physical distance (Khrustaleva et al. 2005; Sun et al. 2013; Si et al. 2015; Romanov et al. 2015).

Cytogenetic maps allow us to determine the approximate distance between genetic markers, but not the exact distance (number of base pairs). They show the positions of genetically mapped markers on chromosomes relative to centromeres, telomeres, heterochromatin, and euchromatin. Importantly, cytogenetic mapping is an essential tool for ordering markers in regions where recombination is absent. Discrepancies between marker positions on the genetic map and their actual locations relative to each other on the physical chromosome often occur in regions of suppressed recombination (Szinay et al. 2008). Multiple approaches have been used to develop integrated genetic and cytogenetic maps. For instance, in wheat, 436 deletion lines were constructed so that the deletions could serve as landmarks relating physical position to recombination (Sandhu et al. 2001). However, this approach cannot be used for diploid species such as onion, because diploid organisms do not tolerate large deletions. To overcome this problem, Khrustaleva et al. (2005) applied a novel strategy to relate markers to their positions on a physical chromosome. To develop an integrated map, the authors used an Allium trihybrid population that originated from a cross between Allium cepa and (A. roylei x A. fistulosum). This population is an ideal source for integrated mapping because, in each chromosome pair, one homoeologous chromosome originates from the interspecific hybrid between A. roylei and A. fistulosum and the other, nonrecombinant homoeologous chromosome originates from A. cepa (Fig. 5.1i). The recombination sites between A. roylei and A. fistulosum on the recombinant chromosome were visualized via GISH. The AFLP profiles of individual genotypes were compared with the corresponding recombinant chromosomes. Integrated physical and recombination maps of chromosomes 5 and 8 were constructed simultaneously for A. roylei and A. fistulosum. The integration of genetic and chromosome maps demonstrated how the relationship between genetic and physical distance varies with marker position on the physical chromosome: the estimates were 1.4 Mb/cM in the recombination hotspot and 74.3 Mb/cM in the region of low recombination.
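To make the difference between the two estimates concrete, the conversion between genetic and physical distance can be written out explicitly. The sketch below uses only the two ratios reported above (1.4 and 74.3 Mb/cM); the function names and the 5 cM interval in the usage example are hypothetical and serve illustration only.

```python
# Physical-to-genetic ratios reported in the text for the integrated maps.
MB_PER_CM_HOTSPOT = 1.4    # region of frequent recombination
MB_PER_CM_COLDSPOT = 74.3  # region of low recombination


def ratio_mb_per_cm(physical_mb: float, genetic_cm: float) -> float:
    """Physical distance (Mb) spanned by one centimorgan in an interval."""
    if genetic_cm <= 0:
        raise ValueError("genetic distance must be positive; "
                         "intervals without recombination have no finite ratio")
    return physical_mb / genetic_cm


def physical_span_mb(genetic_cm: float, mb_per_cm: float) -> float:
    """Physical length (Mb) implied by a genetic distance at a given ratio."""
    return genetic_cm * mb_per_cm


# How much more DNA one cM covers in the cold region than in the hotspot.
fold_difference = MB_PER_CM_COLDSPOT / MB_PER_CM_HOTSPOT

if __name__ == "__main__":
    interval_cm = 5.0  # hypothetical marker interval, for illustration only
    print(physical_span_mb(interval_cm, MB_PER_CM_HOTSPOT))   # Mb in the hotspot
    print(physical_span_mb(interval_cm, MB_PER_CM_COLDSPOT))  # Mb in the cold region
    print(round(fold_difference, 1))
```

The same genetic distance thus corresponds to roughly 53 times more DNA in the low-recombination region than in the hotspot, which is why marker order and spacing on a linkage map cannot be read as physical positions.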

However, the approach to integrated mapping described above does not cover regions that lack recombination. The most direct and effective way to construct cytogenetic maps for organisms with large and complex genomes like onion is to localize single-copy genes directly on chromosomes. With this approach, the LFS gene (Masamura et al. 2012) (Fig. 5.1f, 2b), EST-based markers (Romanov et al. 2015), and SNP and RFLP markers tightly linked to the male fertility restoration gene (Khrustaleva et al. 2016) (Fig. 5.1g, 2a) were mapped on physical chromosomes and integrated into genetic maps.

To conclude, the reader may be skeptical about integrated mapping because, with next-generation sequencing, whole-genome sequencing of any organism has become fast and robust. However, scaffold ordering and whole-genome assembly remain a challenge. Recent publications have shown massive discrepancies between in silico assembled versions of genomes and the actual nuclear genomes. Misassembled genome sequences occurred mostly in recombination “cold” spots (Yang et al. 2014; Karafiátová et al. 2013; Shearer et al. 2014). Therefore, the order and orientation of sequenced scaffolds in pseudomolecules should be corrected using independent physical methods, such as FISH and optical mapping. Optical mapping has been used to improve de novo plant genome assemblies (Tang et al. 2015). This method makes it possible to visualize the locations of restriction sites or sequence motifs under a light microscope.

5.9 Application of Molecular Cytogenetics in the Onion Interspecific Breeding

Interspecific hybridization plays an important role in bulb onion breeding, since the gene pool of bulb onion appears to have been rather depleted during its long history of more than 5000 years of human cultivation (Jones 1983). The ancestral species of A. cepa has been lost, and with it many valuable properties. Closely related species can be used as donors of economically valuable traits in bulb onion breeding. Effective identification of alien chromosomes is essential for monitoring the alien genetic material. Molecular markers, however, can only reveal the regions from which they have been derived. A large number of markers representing different chromosomal regions would have to be used to analyze a complete chromosome. Furthermore, the presence of a marker (or a syntenic group of markers) often does not distinguish whether one copy or multiple copies of a particular chromosome are present in the plant (Dong et al. 1999). Molecular cytogenetic identification of alien chromatin in breeding lines includes determining both the genomic origin and the chromosomal identity of a chromosome or chromosomal segment. GISH provides direct visual identification of parental genomes in interspecific hybrids and their backcross progenies. GISH is a modification of fluorescence in situ hybridization in which labeled total genomic DNA of one parental species is used as a probe together with unlabeled genomic DNA from the other species at a higher concentration; the latter serves as blocking DNA by hybridizing with the sequences common to both genomes.

Allium fistulosum is a rich source of desirable traits for the breeding of new onion (A. cepa) cultivars. The first attempt to transfer genes from A. fistulosum to A. cepa by interspecific hybridization was made by Emsweller and Jones in 1935. The F2 plants proved largely sterile, although seed set occasionally occurred upon selfing, and the authors obtained several sterile BC1 plants. Levan (1936) also obtained F2 plants from hybrids between A. cepa and A. fistulosum, and only those with spontaneous chromosome doubling were fertile. Maeda (1937) found that the F2 derivatives and the backcross progenies with A. fistulosum were mostly sterile diploids. The F1BC3 plants with A. cepa cytoplasm were reported to show a certain level of fertility and a morphology similar to that of A. cepa (Hou and Peffley 2000; Peffley and Hou 2000). Pathak and colleagues (2001) succeeded in producing F3 progenies resistant to stemphylium leaf blight with a high level of pollen fertility (40–80%) and seed set (20–60%). GISH analysis of F2 and advanced generations of interspecific hybrids between A. cepa and A. fistulosum that were relatively resistant to downy mildew was reported by Budylin et al. (2014). GISH analysis of the advanced generations revealed that the F5 and BC1F5 plants that produced few seeds were amphidiploid with 32 chromosomes. Likewise, a sterile BC1F5 plant was found to be triploid, with eight A. fistulosum and 16 A. cepa chromosomes, and a partially fertile BC1F5 plant was amphidiploid with four recombinant chromosomes. Because colchicine treatment was not used in any hybrid generation, the authors suggested that spontaneous polyploidization had occurred. GISH analysis of the recombinant chromosomes suggested that 2n gametes had formed by second division restitution (SDR), since both sister chromatids of recombinant chromosomes remained in the same gamete. Levan (1936) had previously described polyploid F2 plants from a cross between A. cepa and A. fistulosum without artificial chromosome doubling and explained this phenomenon by 2n-gamete formation in both macrosporogenesis and microsporogenesis. Consequently, it seems that viable F2 zygotes with a triploid or tetraploid chromosome set can be formed. In contrast to Budylin et al. (2014) and Levan (1936), other researchers reported diploid F1, F2, and BC1 plants from hybrids between A. cepa and A. fistulosum (Emsweller and Jones 1935; Maeda 1937; Peffley and Hou 2000; Hou and Peffley 2000). Levan (1941) tried to explain the inconsistencies between his data and the results of Emsweller and Jones (1935) by a comparative analysis of meiosis in his hybrids and theirs. He observed significant differences in chromosome behavior during meiosis and suggested that the origin of the parental cultivars played an important role in polyploidization. The hybrids of Emsweller and Jones (1935) and Maeda (1937) were obtained from crosses between “Yellow Danvers” of A. cepa and Japanese cultivars of A. fistulosum (“Nebuka” in the first case and “Hidanegin” in the second). Later, Peffley and Hou (2000) and Hou and Peffley (2000) obtained diploid BC1 and F1BC3 plants, respectively, by interspecific hybridization using “Ishikura” as the A. fistulosum parent. In the crossing experiments of Levan (1936) and Budylin et al. (2014), European cultivars of A. fistulosum were used as pollinators. Taken together, despite numerous reports on interspecific F1-5 hybrids and their backcross progenies, no fertile breeding line possessing a single gene from A. fistulosum in the diploid background of A. cepa had been obtained.
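The ploidy assignments quoted above follow from simple arithmetic on the GISH chromosome counts and the basic number x = 8. The sketch below illustrates that bookkeeping only; the function name, the returned fields, and the rule that each recombinant chromosome is assigned to one parent by the caller before counting are assumptions made for illustration.

```python
BASE_NUMBER = 8  # x = 8 for both A. cepa and A. fistulosum (2n = 2x = 16)

PLOIDY_NAMES = {2: "diploid", 3: "triploid", 4: "tetraploid"}


def describe_hybrid(cepa: int, fistulosum: int, base_number: int = BASE_NUMBER) -> dict:
    """Summarize ploidy from GISH counts of chromosomes of each parental genome.

    Assumption: recombinant chromosomes have already been assigned to one
    parent (for example by the origin of their centromere) before counting.
    """
    if cepa < 0 or fistulosum < 0:
        raise ValueError("chromosome counts cannot be negative")
    total = cepa + fistulosum
    sets, remainder = divmod(total, base_number)
    euploid = remainder == 0
    return {
        "total": total,
        "ploidy_level": sets if euploid else None,
        "ploidy_name": PLOIDY_NAMES.get(sets, f"{sets}x") if euploid else "aneuploid",
        "cepa_sets": cepa / base_number,
        "fistulosum_sets": fistulosum / base_number,
    }


if __name__ == "__main__":
    # Sterile BC1F5 plant from the text: 16 A. cepa + 8 A. fistulosum chromosomes.
    print(describe_hybrid(16, 8))
    # Plant with 32 chromosomes, here assumed to carry 16 from each parent.
    print(describe_hybrid(16, 16))
```

For the plant with 16 A. cepa and eight A. fistulosum chromosomes, the sketch reports three sets in total, two from A. cepa and one from A. fistulosum, in line with the triploid described above.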

To circumvent these problems, it was proposed to use A. roylei as a bridging species to transfer genes from A. fistulosum to A. cepa (de Vries et al. 1992). A. roylei had been shown to cross readily with both A. cepa (van der Meer and de Vries 1990) and A. fistulosum (McCollum 1982). The suitability of this strategy was fully confirmed by GISH analyses of several generations of the bridge-cross (A. cepa x (A. roylei x A. fistulosum)) (Khrustaleva and Kik 1998, 2000). The first and second bridge-cross generations were fertile, and recombination between the chromosomes of A. cepa and A. fistulosum occurred frequently. The power of GISH analysis in the breeding process was clearly demonstrated in the production of onion breeding lines resistant to downy mildew (Scholten et al. 2007). GISH analysis of the advanced generations showed that an A. roylei fragment bearing the Pd1 locus was located at the distal end of the long arm of chromosome 3, in agreement with the genetic mapping of van Heusden et al. (2000a). Moreover, GISH did not detect the large A. roylei segment on both homologous copies of chromosome 3 in any of the 14 downy mildew-resistant plants of F1BC5S3 progeny 2348, a progeny that segregated for resistance, or in any of the six plants of F1BC5S3 progeny 3591, a progeny that consisted only of resistant plants. This observation led to the hypothesis that a recessive lethal factor is located proximal to the downy mildew resistance gene within the large A. roylei segment. The subsequent development of a molecular marker closely linked to the downy mildew resistance gene validated this hypothesis. Thus, the breeding process was complicated by a lethal factor that seems to be expressed only in an A. cepa background. Crossing-over between homoeologous chromosomes separated the downy mildew resistance locus from the lethal factor, and the remaining small A. roylei segment harboring Pd1 could then be made homozygous without any problems.

5.10 Summary and Perspectives for the Future

In conclusion, FISH technology, together with technical progress in genome sequencing and bioinformatics, is very useful for studying the molecular organization of repetitive and single-copy sequences along chromosomes. Past achievements in the molecular cytogenetic study of alliums can be summarized as follows: (1) a number of markers (SNP, RFLP, EST, AFLP) and genes (45S rDNA, 5S rDNA, LFS, and Pd1) were mapped on the respective physical chromosomes; (2) several species-specific markers for the identification of individual chromosomes were developed; (3) centromeric DNA sequences for A. fistulosum and centromeric Ty3/gypsy retrotransposons for A. cepa and A. fistulosum were found; and (4) the unusual telomeric sequence (CTCGGTTATGGG)n of alliums was discovered (Figs. 5.3 and 5.4).

Fig. 5.3 Idiograms of A. cepa chromosomes with indicated position of FISH-mapped repetitive and unique DNA sequences

Fig. 5.4 Idiograms of A. fistulosum chromosomes with indicated position of FISH-mapped repetitive and unique DNA sequences

In the future, the development of FISH markers and the construction of high-density cytogenetic maps will accelerate the ongoing genome sequencing projects of A. cepa and A. fistulosum. Because of the problems of whole-genome assembly, an integrated approach that includes different sequencing strategies is needed. Such an approach will combine long-insert libraries, long-read sequencing (e.g., PacBio sequencing), and Hi-C-based scaffolding with genetic maps and independent tools such as cytogenetic mapping and optical mapping (Korbel and Lee 2013; Cao et al. 2016; Chaney et al. 2016). To anchor assembled pseudochromosomes to each arm of the physical chromosomes of A. cepa and A. fistulosum and to determine their north–south orientation, at least 32 probes (two per pseudochromosome) should be developed. Such a cytogenetic map will contribute to the progress of Allium breeding through effective map-based cloning and accurate genome assembly. The development of robust cytogenetic markers will extend our knowledge of Allium genome organization and evolution and will fill the gaps between genome sequencing and sub-chromosomal measurement data. Furthermore, the integration of linkage and cytogenetic maps will extend our scanty knowledge of recombination rates and patterns in higher plants and will probably shed light on several issues through in-depth studies: (1) Why are the recombination sites localized adjacent to the centromeres of A. fistulosum chromosomes? (2) Why are the recombination sites distributed randomly on A. cepa chromosomes? (3) Why is the recombination frequency of plants higher than that of animals? In 1913, Alfred Sturtevant constructed the first genetic map on the basis of Morgan’s theories of crossing-over. James Watson and Francis Crick solved the three-dimensional structure of DNA in 1953. Today, we need to solve the big puzzle of the organization and function of the whole genome. Molecular cytogenetics will play an essential role as a bridge between genomics, classical genetics, and other biological disciplines.
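The probe requirement stated above (two probes per pseudochromosome, 32 in total) and the way a probe pair fixes orientation can be sketched as follows. This is an illustration under stated assumptions, not a published protocol: the function names, the base-pair coordinates, and the convention that the short arm is drawn at the top ("north") of the idiogram are all hypothetical choices made for this example.

```python
SPECIES = ("A. cepa", "A. fistulosum")
CHROMOSOMES_PER_SPECIES = 8       # 2n = 2x = 16, so 8 pseudochromosomes each
PROBES_PER_PSEUDOCHROMOSOME = 2   # one probe per chromosome arm


def minimum_probe_count() -> int:
    """Probes needed to anchor both arms of every pseudochromosome in both species."""
    return len(SPECIES) * CHROMOSOMES_PER_SPECIES * PROBES_PER_PSEUDOCHROMOSOME


def pseudochromosome_orientation(short_arm_probe_bp: int, long_arm_probe_bp: int) -> str:
    """Infer the orientation of an assembled pseudochromosome from two FISH probes.

    Each argument is the sequence coordinate (in bp) of a probe within the
    pseudochromosome; FISH has shown one probe on the short arm and the other
    on the long arm. Assumed convention: the short arm is "north".
    """
    if short_arm_probe_bp == long_arm_probe_bp:
        raise ValueError("the two probes must lie at different coordinates")
    if short_arm_probe_bp < long_arm_probe_bp:
        return "north-to-south"   # sequence starts at the short-arm end
    return "south-to-north"       # sequence must be reverse-complemented


if __name__ == "__main__":
    print(minimum_probe_count())
    # Hypothetical coordinates, for illustration only.
    print(pseudochromosome_orientation(12_000_000, 1_450_000_000))
```

Two probes are the minimum because a single probe can assign a pseudochromosome to a physical chromosome but cannot tell which end of the sequence corresponds to which arm.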