Introduction

Capsicum annuum belongs to the Solanaceae family. There are five domesticated and approximately 25 wild species in the economically important genus Capsicum (Cheema and Pant 2013). The fruits of Capsicum species are used as spices, and C. annuum is among the most consumed vegetable crops worldwide (Kwon and Kim 2009). Capsicum peppers were introduced to Asia from South America during the sixteenth century via trade routes (Castro-Concha et al. 2014). Currently, the largest producer is China (Paran et al. 2007). Moreover, there are two distinct groups present in the genus: species that have 12 pairs of chromosomes (2n = 24), and species that have 13 pairs of chromosomes (2n = 26) (Rohami et al. 2010). C. annuum is diploid with a genome size of 3.26 Gb (Qin et al. 2014).

Recent advances in genomics have shown that a large proportion of eukaryotic genomes consists of repetitive elements (REs). These sequences can account for up to 90% of the genome size (Mehrotra and Goyal 2014; Pertea 2012). REs can be classified into two broad classes: satellite sequences, which include microsatellites and minisatellites, and dispersed repeats such as transposable elements (TEs) (Wang et al. 2012). REs play important roles in species evolution because they are involved in several processes, such as the movement, pairing, recombination, and arrangement of chromosomes, the regulation of gene expression, and genic responses to environmental stimuli (Mehrotra and Goyal 2014). TEs play key roles in adaptation, diversification, and speciation (Kalendar et al. 2000; Schrader et al. 2014). One type of TE, the long terminal repeat (LTR) retrotransposon, represents one of the largest DNA components of the genome (Liu et al. 2018); for example, the proportions of LTR retrotransposons are greater than 75%, 41.35%, and 68.68% in maize (Schnable et al. 2009), wheat (Jia et al. 2013), and coix (Cai et al. 2014), respectively. In general, LTR retrotransposons are distributed along the chromosomes and are also found in the centromeric region (Li et al. 2017), and they play significant roles in gene regulation and evolution (Galindo-González et al. 2017).

The latest genome assembly of C. annuum showed that repeats comprised 80.9% of the genome. Among those REs, the majority were long terminal repeat (LTR) elements, which represent approximately 70.3% of the genome, and most of the LTR elements were Gypsy elements (Kim et al. 2014; Qin et al. 2014). In a previous study, an unknown type of repeat, CaLUR, with a long unit length (18–24 kb) was identified and found to be randomly distributed on C. annuum pachytene chromosomes; however, its functional role was unclear (Park et al. 2012).

Ribosomal DNAs are essential genetic elements involved in ribosome function (Richard et al. 2008). They include two major subfamilies: the 5S and 45S (18S–5.8S–25S) rDNAs (Galian et al. 2012). The copy number and chromosomal distribution of rDNAs have been observed to change rapidly (Muratović et al. 2010). Because of this property, rDNAs have been used as molecular markers for genome mapping (Witsenboer et al. 1997) and cytogenetic research (Gupta and Varshney 2000) in different plants.

Characterizing the types of repeats in a genome and understanding their chromosomal distribution and potential function will allow the improvement of crop breeding programs. A rapid and efficient approach to identifying major genomic REs is through the RepeatExplorer pipeline (Novák et al. 2013), using low-coverage whole-genome sequencing (WGS) datasets obtained either through de novo sequencing or from public repositories. Here, we performed a genome-wide analysis of major repeats in C. annuum using RepeatExplorer and performed FISH on one novel satellite DNA.

Materials and methods

Plant samples

Seeds of C. annuum L. (IT032384) were obtained from the National Agrobiodiversity Center, Republic of Korea. The seeds were germinated in a plastic cup filled with horticulture-grade soil. Root tips were harvested when the seedlings were about 5 cm tall. The root tips were pre-treated with 2 mM 8-hydroxyquinoline solution for 5 h at 18 °C. The samples were then fixed in Carnoy’s fixative (3:1 v/v absolute ethanol:glacial acetic acid) overnight. Finally, the samples were stored in 70% ethanol at 4 °C for later use.

Genomic DNA extraction

Two grams of young seedling leaves of C. annuum were collected for DNA extraction. Genomic DNA was extracted using a modified cetyltrimethylammonium bromide (CTAB) protocol described by Allen et al. (2006). The concentration of the extracted genomic DNA was measured using a Colibri Microvolume Spectrometer (Titertek Berthold, Pforzheim, Germany).

Identification of satellite DNA

The C. annuum WGS reads (SRR653476) were downloaded from the National Center for Biotechnology Information (https://www.ncbi.nlm.nih.gov/Traces/study/?acc=BGI_Zunla_1). Read quality trimming, sampling of reads equivalent to 0.05 × coverage of the C. annuum genome, and read clustering were carried out using the Tandem Repeat Analyzer (TAREAN) (Novák et al. 2017) workflow in RepeatExplorer (Novák et al. 2013).
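
The number of reads corresponding to a 0.05 × sample can be approximated as coverage × genome size / read length. The sketch below is illustrative only: the genome size is the 3.26 Gb value cited in the Introduction, whereas the 100-bp read length and the function name are assumptions, since the read length of the dataset is not stated here.

```python
import math


def reads_for_coverage(genome_size_bp: int, coverage: float, read_length_bp: int) -> int:
    """Return the number of reads whose combined length reaches the target coverage."""
    return math.ceil(genome_size_bp * coverage / read_length_bp)


GENOME_SIZE_BP = 3_260_000_000  # 3.26 Gb (Qin et al. 2014)
COVERAGE = 0.05                 # sampling depth used for clustering
READ_LENGTH_BP = 100            # ASSUMPTION: read length is not given in the text

n_reads = reads_for_coverage(GENOME_SIZE_BP, COVERAGE, READ_LENGTH_BP)
print(f"Reads to sample for {COVERAGE}x coverage: {n_reads:,}")
```

Under these assumptions the sample amounts to roughly 1.63 million reads; a different read length changes the count proportionally.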

Probe preparation

The pre-labeled oligonucleotide probes (PLOPs) for 5S and 45S rDNAs and Arabidopsis-type telomeric repeats were designed by Waminal et al. (2018) and provided by the Bioneer Corporation (South Korea). The probe for Ca167TR was prepared by PCR amplification using primers (Table 1) designed with Primer3Plus (http://www.primer3plus.com/cgi-bin/dev/primer3plus.cgi). A TaKaRa Ex Taq® DNA Polymerase kit (TaKaRa RR001B, Kusatsu, Japan) was used for PCR amplification of Ca167TR. The PCR mixture contained 150 ng of template DNA, 2 μM each of forward and reverse primers, 0.2 mM dNTP mixture (TaKaRa), 2.5 units of Ex Taq polymerase (TaKaRa), 5 μl of 10 × Ex Taq buffer (TaKaRa), and nuclease-free water (Sigma-Aldrich RNBG3073, Darmstadt, Germany) up to a final volume of 50 μl. The PCR amplification was carried out as follows: initial denaturation at 95 °C for 5 min; followed by 35 cycles of denaturation at 95 °C for 30 s, annealing at 55 °C for 20 s, and extension at 72 °C for 15 s; and a final extension at 72 °C for 5 min. The PCR amplicons were labeled with biotin-16-dUTP according to the manufacturer’s protocol (Biotin-Nick Translation Mix, Roche).
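
For planning purposes, the cycling conditions above can be written as a small data structure and summed. This is a sketch, not the thermocycler's own program format: the class and variable names are hypothetical, and the total covers hold times only, ignoring temperature ramping, which depends on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


# Values taken from the PCR conditions described in the text.
INITIAL = Step("initial denaturation", 95, 5 * 60)
CYCLE = [
    Step("denaturation", 95, 30),
    Step("annealing", 55, 20),
    Step("extension", 72, 15),
]
N_CYCLES = 35
FINAL = Step("final extension", 72, 5 * 60)


def total_hold_seconds() -> int:
    """Sum of all programmed hold times (ramp times are not included)."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL.seconds + N_CYCLES * per_cycle + FINAL.seconds


minutes, seconds = divmod(total_hold_seconds(), 60)
print(f"Programmed hold time: {minutes} min {seconds} s")
```

The programmed holds add up to a little under 48 min; actual run time will be longer once ramping is included.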

Table 1 Primers used in this study

Slide preparation

Chromosome preparation was carried out according to a previously described method (Waminal et al. 2011) with some modifications. Briefly, 2-mm-long root tips were dissected and digested in 1:2 (%) ratio of pectolyase and cellulase for 90 min at 37 °C. The root tips were then washed with distilled water. The meristematic tissue was pipetted into Carnoy’s solution (3:1 v/v absolute ethanol:glacial acetic acid), squashed, and simultaneously vortexed for 15 s. The suspension was then centrifuged at 5000 rpm for 5 min, and the supernatant was decanted carefully. The protoplasts were re-suspended in acetic acid:ethanol (9:1 v/v) solution. The final suspension was mounted on a glass slide pre-warmed to 70 °C in a humidity chamber and air-dried at room temperature (23–25 °C).

Fluorescence in situ hybridization

Fluorescence in situ hybridization (FISH) with PLOPs was based on the method described by Waminal et al. (2018). Briefly, a total of 40 µl of FISH mixture containing 25 ng each of the 5S rDNA, 45S rDNA, and telomeric repeat PLOPs, 50% formamide, 10% dextran sulfate, and 2 × SSC was prepared for each slide. The FISH mixture was placed on a chromosome slide and denatured at 80 °C on a slide heater for 5 min. Hybridization then proceeded at room temperature in a humid chamber for 30 min in the dark. The stringent washes were as follows: 2 × SSC at room temperature for 10 min, 0.1 × SSC at 42 °C for 25 min, and 2 × SSC at room temperature for 5 min. The slides were then dehydrated through an ethanol series (70%, 90%, and 100%) at room temperature for 3 min each and dried in the dark. Finally, the slides were counterstained with 4′,6-diamidino-2-phenylindole (DAPI) (Roche 70217321, Hoffmann-La Roche Ltd., Basel, Switzerland) at a ratio of 1:100 DAPI (100 µg/ml stock) to Vectashield (Vector H1000, Vector Laboratories, Burlingame, CA, USA). Slides were observed under an Olympus BX53 fluorescence microscope (Olympus Co., Tokyo, Japan) with a built-in CCD camera (CoolSNAP, Photometrics, Tucson, AZ, USA), using an oil immersion lens (100 × magnification).
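
The component volumes for one 40-µl hybridization mixture follow from the dilution relation C1·V1 = C2·V2. The sketch below is a worked example under stated assumptions: the final concentrations are those given above, but the stock concentrations (100% formamide, 50% dextran sulfate, 20 × SSC) are assumed typical values that are not reported in the text, and all names are hypothetical.

```python
def stock_volume_ul(final_conc: float, stock_conc: float, total_ul: float) -> float:
    """Volume of stock needed so the final mixture reaches final_conc (C1*V1 = C2*V2)."""
    return final_conc * total_ul / stock_conc


TOTAL_UL = 40.0  # mixture volume per slide, from the text

# component: (final concentration from the text, ASSUMED stock concentration)
COMPONENTS = {
    "formamide (%)": (50.0, 100.0),
    "dextran sulfate (%)": (10.0, 50.0),
    "SSC (x)": (2.0, 20.0),
}

volumes = {
    name: stock_volume_ul(final, stock, TOTAL_UL)
    for name, (final, stock) in COMPONENTS.items()
}
# Whatever is left is available for the three probes plus water.
remaining_ul = TOTAL_UL - sum(volumes.values())

for name, vol in volumes.items():
    print(f"{name}: {vol:.1f} ul")
print(f"probes + water: {remaining_ul:.1f} ul")
```

With these assumed stocks, the three reagents occupy 32 µl, leaving 8 µl for the probes and water; more dilute stocks would leave less room for probe.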

Chromosomes were washed and reprobed with Ca167TR following the method described by Lim et al. (2005). In order to remove the oil and coverslip, the slides were washed in 70% ethanol and 2 × SSC for 5 min and 10 min, respectively. Then, the slides were immersed in 4 × SSC containing 0.2% Tween-20 for 1 h, incubated in 2 × SSC containing 70% formamide for 5 min at 80 °C, and dried in the air for re-probing. The biotin-labeled Ca167TR probe was detected with streptavidin-Cy3 conjugate (Zymed Lab., USA).

Results and discussion

The RepeatExplorer analysis generated eight major repeats, which included one high-confidence satellite repeat, three low-confidence satellite repeats, three LTRs, and one rDNA (Table 2). We identified one read cluster (CL240) representing a novel high-confidence 167-bp satellite repeat with a genomic proportion of 0.011%, which we named Ca167TR (Table 3). We developed a FISH probe for this tandem repeat to test its potential as a cytogenetic marker for pepper.
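
To give a sense of scale, the genomic proportion of Ca167TR can be converted into an approximate total length and monomer copy number. This is a rough back-of-the-envelope sketch, not a result of the study: it assumes the 3.26 Gb genome size cited in the Introduction, treats the 0.011% estimate from sampled reads as exact, and assumes that every copy is a full-length 167-bp monomer.

```python
GENOME_SIZE_BP = 3_260_000_000   # 3.26 Gb (Qin et al. 2014)
GENOME_PROPORTION_PCT = 0.011    # Ca167TR proportion reported in the text
MONOMER_BP = 167                 # Ca167TR unit length reported in the text

# Total sequence attributed to the repeat, then the number of monomers it implies.
array_bp = GENOME_SIZE_BP * GENOME_PROPORTION_PCT / 100
copies = array_bp / MONOMER_BP

print(f"Ca167TR total length: ~{array_bp / 1000:.0f} kb")
print(f"Ca167TR copy number:  ~{copies:.0f} monomers per haploid genome")
```

Under these assumptions the repeat spans roughly 360 kb, on the order of two thousand monomers, spread over the loci detected by FISH.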

Table 2 List of annotations and genome proportions of major repeat clusters
Table 3 General information on the novel high-confidence satellite repeat Ca167TR of C. annuum L.

Cytogenetic studies of C. annuum, including reports of chromosome number, karyotyping by fluorescence staining (CMA/DA/DAPI), and FISH with rDNA and telomeric probes, have been conducted previously (Hwang et al. 2010; Lippert et al. 1966; Moscone et al. 2007; Pickersgill 1991). Here, we observed a chromosome number of 2n = 24 in C. annuum, consistent with previous studies (Cheema and Pant 2013; Dixit 1931; Huskins and La-Cour 1930). Our FISH analysis of sporophytic metaphase chromosomes showed one pair of 5S rDNA signals and three pairs of 45S rDNA signals (Fig. 1), in accordance with the results reported by Hwang et al. (2010). The 5S rDNA signals were located in the subtelomeric region of the short arm of chromosome 9, while one pair of major and two pairs of minor 45S rDNA signals were detected in the subtelomeric regions of the short arms of chromosomes 12 and 4 and the long arm of chromosome 10, respectively. Two pairs of Ca167TR repeat signals were detected in the subtelomeric regions of the short arm of chromosome 3 and the long arm of chromosome 4. Telomeric repeat signals were detected at both ends of almost all chromosomes (Fig. 2; Table 4).

Fig. 1

FISH on metaphase chromosomes of C. annuum L. a 5S rDNA (green), b 45S rDNA (red), c Ca167TR repeat (magenta), d telomere (yellow) signals, and e merged. The red and magenta arrows in b, c, and e indicate the weak 45S rDNA and Ca167TR repeat signals, respectively. Scale bar = 10 µm (color figure online)

Fig. 2

FISH karyogram of C. annuum L. showing the merged signals (E) of 5S rDNA (green, A), 45S rDNA (red, B), Ca167TR (magenta, C), and Arabidopsis-type telomere (yellow, D). The red and magenta arrows in B, C, and E indicate the weak 45S rDNA and Ca167TR repeat signals, respectively. Scale bar = 10 µm (color figure online)

Table 4 Summary of the chromosomal distribution of tandem repeats in C. annuum L.

Hwang et al. (2010) and Romero-da Cruz et al. (2017) performed FISH-based karyotype analyses of C. annuum with rDNA signals that could reliably distinguish two chromosomes (those carrying the 5S and 45S major signals). With the newly found tandem repeat Ca167TR, aside from chromosomes 9 (5S signal) and 12 (45S major signal), we could reliably distinguish three more chromosomes: chromosomes 3 (Ca167TR signal), 4 (45S minor and Ca167TR signals), and 10 (45S minor signal). Thus, Ca167TR is a useful cytogenetic marker for C. annuum.

The genomes of higher plants contain large numbers of repetitive elements (REs). These REs have a well-defined impact on chromosome composition and are the main cause of genome size variation (Garrido-Ramos 2015; Piegu et al. 2006). The genome of C. annuum is approximately four times larger than the tomato (760 Mb) (Consortium TG 2012) and potato (727 Mb) (Xu et al. 2011) genomes, mainly due to the large proportion of REs in both heterochromatic and euchromatic regions (Kim et al. 2014). Repeats account for 80.9% of the C. annuum genome (Qin et al. 2014). REs such as tandem repeats are usually abundant in telomeric and centromeric regions (Rao et al. 2010). Similarly, the newly found Ca167TR repeat of C. annuum was detected in subtelomeric regions (Fig. 2).
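
The size comparison above can be checked directly from the figures cited in this article. The short sketch below is only arithmetic on those reported values (3.26 Gb for pepper, 760 Mb for tomato, 727 Mb for potato, and 80.9% repeat content); the variable names are hypothetical.

```python
PEPPER_MB = 3260.0       # C. annuum, 3.26 Gb (Qin et al. 2014)
TOMATO_MB = 760.0        # tomato (Consortium TG 2012)
POTATO_MB = 727.0        # potato (Xu et al. 2011)
REPEAT_FRACTION = 0.809  # repeat content of the C. annuum genome

ratio_tomato = PEPPER_MB / TOMATO_MB
ratio_potato = PEPPER_MB / POTATO_MB
repeat_mb = PEPPER_MB * REPEAT_FRACTION  # sequence attributed to repeats

print(f"Pepper vs tomato: {ratio_tomato:.1f}x")
print(f"Pepper vs potato: {ratio_potato:.1f}x")
print(f"Repetitive sequence in pepper: ~{repeat_mb / 1000:.2f} Gb")
```

The ratios come out at about 4.3 and 4.5, in line with the "approximately four times" statement, and the repetitive fraction alone (about 2.6 Gb) exceeds the entire tomato or potato genome several times over.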

Here, we have confirmed the usefulness of Ca167TR as a cytogenetic marker for C. annuum. Further FISH experiments are needed to localize the other major repeats identified in this study. Additionally, whether these repeats affect the spiciness and stress response of C. annuum by interfering with the capsaicin biosynthetic pathway and the biotic and abiotic stress signalling pathways remains unknown and is therefore an interesting topic for future research.