Introduction

Cytoplasmic male sterility (CMS) is a trait in plants where viable pollen cannot be produced. CMS is maternally inherited, occurs in more than 150 higher plant species, and is important in the evolution of gynodioecious species (Dufay et al. 2007). CMS has been extensively exploited to produce hybrids showing heterosis in a wide variety of crops (Touzet and Meyer 2014; Chen and Liu 2014). Heterosis refers to performance of hybrids that is superior to that of the parental lines from which they are derived (Baranwal et al. 2012).

Stamen-defective CMS is determined by variation of mitochondrial genes, resulting in a lack of compatibility between the nucleus and the cytoplasm (Tang et al. 2017). Chimeric mitochondrial open reading frames (ORFs), which are generated by rearrangement and recombination of mitochondrial genomes, are the primary or only cause of CMS (Horn et al. 2014). In the mitochondrial genome, chimeric genes are often co-transcribed with genes encoding respiratory chain components, producing aberrant transcripts or encoding toxic proteins that interfere with the normal functioning of mitochondria at critical stages of anther development, resulting in pollen sterility (Touzet and Meyer 2014).

In tobacco (Nicotiana tabacum), over 30 male sterile lines have been developed, all of which were derived from interspecific hybridization, by either somatic protoplast fusion or traditional breeding, between tobacco and other Nicotiana species, such as N. debneyi, N. megalosiphon, N. suaveolens, N. bigelovii, N. hesperis, N. rustica, N. glauca, and N. repanda. sua-CMS is the only sterile cytoplasm thus far identified as having no negative effects on growth or quality (Tong 1997) and is, therefore, widely used to produce hybrid tobacco. At present, most of the CMS lines used in hybrid tobacco in China, America, Japan, and Brazil are sua-CMS (Cao and Yu 2017).

The sterile cytoplasm of the sua-CMS line was developed by somatic fusion between tobacco and N. suaveolens in the 1950s. Ma et al. (2005) confirmed that there were no significant differences between the sua-CMS lines and the homogenous maintainer lines in terms of biological characteristics, agronomic traits, or disease resistance. The sua-CMS flower is similar to that of the maintainer lines in all traits (shape, sepal morphology, petal color, and pistil morphology) except male sterility. Cytological studies of sua-CMS lines showed a lack of male reproductive ability between primordial differentiation of the stamens and the formation of sporogenous cells (Tong 1997).

Development and identification of different CMS types are necessary for sustainability of heterosis in a crop species. Cytology, protein markers, and DNA markers are often used to identify CMS lines. Some cytological studies of anther development in tobacco CMS have been reported (Pollak 1992; Farbos et al. 2001; Fitter et al. 2005). However, anther development, especially in male sterile plants, is often affected by temperature, photoperiod, or moisture (Yuan et al. 1993; Xue et al. 2018). Molecular markers based on DNA polymorphisms are relatively stable and are widely used to identify CMS lines (Bonhomme et al. 1992). The genes associated with CMS are often used as specific markers to identify different sterile cytoplasms. In the sua-CMS line of tobacco, some SNPs located in mitochondrial genes were identified (Zhao et al. 2009). The mitochondrial genome of the sua-CMS line has been sequenced (Li et al. 2016), allowing variations in ORFs associated with sua-CMS to be identified.

The sua-CMS lines of tobacco are used globally to produce male sterile plants and increase heterosis, yet little work has been done to characterize the underlying genes. Description of the genetic basis of these lines could assist with breeding efforts and the development of new lines. In this study, six specific ORFs of the tobacco sua-CMS lines were characterized and the expression patterns of the ORFs were analyzed.

Materials and methods

Plant materials

The plant materials in this study included seven types of CMS (sua-, glu-, rep-, rus-, tab1-, tab2-, and tab3-) and their maintainer lines. Each CMS type included a CMS line (BC5F1–BC10F1) and three BC1F1 offspring obtained from the male sterile line and the three common maintainer lines. In addition, two sua-CMS lines that were backcrossed more than ten generations with their corresponding maintainer lines and three F1 hybrids that harbored sua-sterile cytoplasm were also used to identify genes unique to sua-CMS. The material characters and types of CMS used in this study are listed in Table 1. All plant materials were grown in the experimental field of the Tobacco Research Institute in Qingdao, Shandong, China in 2017.

Table 1 Characters and types of tobacco plant materials evaluated in this study

Nicotiana accessions (284) were obtained from the Chinese Tobacco Germplasm Repository (Tobacco Research Institute, Chinese Academy of Agricultural Sciences, Qingdao, Shandong, China; Table S1). These belonged to three subgenera: subgenus Nicotiana included 201 flue-cured, 42 sun-cured, 12 burley, three cigar, and nine oriental tobacco accessions, and two wild-type species; subgenus Rustica included nine rustica tobacco accessions and one wild-type species; and subgenus Petunioides included five wild-type species. These plant materials were grown in the experimental field of the Tobacco Research Institute in Qingdao, Shandong, China in 2014–2016.
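The accession counts given above can be tallied to confirm that they add up to the stated total of 284. The short Python sketch below only restates the numbers from the text; the dictionary layout and variable names are illustrative choices.

```python
# Accession counts as stated in the text, grouped by subgenus.
accessions = {
    "Nicotiana": {
        "flue-cured": 201,
        "sun-cured": 42,
        "burley": 12,
        "cigar": 3,
        "oriental": 9,
        "wild-type species": 2,
    },
    "Rustica": {"rustica": 9, "wild-type species": 1},
    "Petunioides": {"wild-type species": 5},
}

# Sum within each subgenus, then across subgenera.
subgenus_totals = {name: sum(groups.values()) for name, groups in accessions.items()}
total_accessions = sum(subgenus_totals.values())

print(subgenus_totals)
print(total_accessions)
```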

Observation of stamen morphology and scanning electron microscopy of pollen

Around 10:00 AM in sunny weather in July 2017, emerging flowers were sampled from fully flowering plants of the sua-CMS line msZhongyan100 (sZY) and the maintainer line Zhongyan100 (ZY). Petals and sepals were carefully peeled back to observe the morphology of the stamens. Mature anthers of sZY and ZY were collected and fixed in a glutaraldehyde fixation solution. The pollen grains were sputter-coated with platinum particles and examined using a scanning electron microscope (JEOL: JSM-840, Tokyo, Japan).

Pollen vitality assay

To test pollen viability, mature anthers were stained with fluorescein diacetate (FDA) according to the methods of McCabe et al. (1997). Anthers were cut with a sharp blade on a glass slide in a few drops of FDA working solution and pollen grains were released. After incubation in the dark for 5 min, pollen grains were observed using a microscope (Nikon Eclipse 80i, Tokyo, Japan) equipped with a charge-coupled device (CCD) for photography.

Annotation and selection of ORFs

The potential ORFs in mitochondrial genomes were identified using the Artemis software (Rutherford et al. 2000). ORF-Finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html) and tRNAscan-SE (Lowe and Eddy 1997) were also used to further define ORFs of at least 210 nt. Common ORFs with > 99% shared identity between sZY and ZY were identified using blastn and blastx (Altschul et al. 1990), and were subsequently discarded. To improve identification accuracy, the unique ORFs in the mitochondrial genome of sZY were compared with additional plant mitochondrial genome annotations in NCBI using blastn and blastx.
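To illustrate the length criterion used above, the following Python sketch scans both strands of a sequence in all three reading frames and keeps ORFs of at least 210 nt. It is not the algorithm implemented in Artemis or ORF-Finder; it assumes that an ORF runs from the first ATG in a frame to the next in-frame stop codon (TAA, TAG, or TGA) and that the length includes the stop codon. The function names and the test sequence are hypothetical.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")
MIN_ORF_NT = 210  # length threshold stated in the text


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_len: int = MIN_ORF_NT) -> List[Tuple[str, int, int]]:
    """Return (strand, start, end) for ATG-to-stop ORFs of at least min_len nt.

    Coordinates are 0-based, end-exclusive, and refer to the scanned strand
    (for "-" they index the reverse-complemented sequence).
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG of the currently open ORF
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if i + 3 - start >= min_len:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs


# Hypothetical test sequences: one ORF of 216 nt and one of 36 nt.
long_orf = "CC" + "ATG" + "GCT" * 70 + "TAA" + "GG"
short_orf = "ATG" + "GCT" * 10 + "TAA"
print(find_orfs(long_orf))   # the 216-nt ORF passes the 210-nt threshold
print(find_orfs(short_orf))  # the 36-nt ORF is filtered out
```

In the workflow described above, ORFs passing this filter would then be compared between sZY and ZY, and those sharing > 99% identity would be discarded.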

Isolation of total DNA and PCR amplification of mitochondrial gene-specific primers

Total DNA was isolated from young leaves using the CTAB method (Doyle and Doyle 1990). We designed 27 primer pairs to amplify 34 unique ORFs of sua-CMS in sua-, glu-, rep-, rus-, tab1-, tab2-, and tab3-CMS types and in fertile tobacco. Each primer pair spanned one, two, or three complete ORFs, depending on the positions of the ORFs. Primer pairs used in this study are listed in Table 2.

Table 2 Primer sequences used for amplification of open reading frames (ORFs) in tobacco hybrids

PCR was performed in 10 µL reaction mixtures containing 100 ng total DNA, 1 unit Taq DNA polymerase (Thermo Scientific, Waltham, MA, USA), 0.2 mM dNTP mix, 1 µL of 10× Taq buffer with (NH4)2SO4, and 0.5 µM of each primer (Tsingke Biotech, Beijing, China). The PCR protocol was as follows: 94 °C for 5 min; 30 cycles of 94 °C for 30 s, 50–56 °C for 30 s (according to the annealing temperature of each primer pair), and 72 °C for 45 s; and a final extension at 72 °C for 7 min. PCR products were detected by electrophoresis on 1% agarose gels, followed by staining with ethidium bromide.
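The cycling protocol above can be written out as an explicit list of steps, which makes the nesting of the 30 cycles unambiguous. The Python sketch below is illustrative only: the class and function names are hypothetical, the annealing temperature of 54 °C is an arbitrary value within the stated 50–56 °C range, and the total counts hold times only (ramp times are ignored).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    name: str
    temp_c: float
    seconds: int


def build_program(annealing_c: float, cycles: int = 30) -> List[Step]:
    """Expand the PCR protocol from the text into a flat list of steps."""
    if not 50 <= annealing_c <= 56:
        raise ValueError("annealing temperature outside the 50-56 °C range in the text")
    program = [Step("initial denaturation", 94, 5 * 60)]
    for _ in range(cycles):
        program.append(Step("denaturation", 94, 30))
        program.append(Step("annealing", annealing_c, 30))
        program.append(Step("extension", 72, 45))
    program.append(Step("final extension", 72, 7 * 60))
    return program


def total_hold_seconds(program: List[Step]) -> int:
    """Sum of hold times; instrument ramp times are not included."""
    return sum(step.seconds for step in program)


program = build_program(annealing_c=54)  # 54 °C is an assumed example value
total_seconds = total_hold_seconds(program)
print(len(program), total_seconds / 60)
```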

Cloning and sequencing

The target DNA fragments were recovered using a PCR purification kit (TIANGEN Biotech, Beijing, China). The recovered DNA fragments were ligated into pMD18-T vectors (TaKaRa, Shiga, Japan), and the products were transformed into competent cells (DH5α). Single-colony cultures containing the inserted target fragments were verified by PCR and sequenced (Tsingke Biotech).

Screening and characterization of mitochondrial ORFs of sua-CMS

To assay the stability of the specific mitochondrial genes in sua-CMS, we amplified the mitochondrial small region (m-sr) fragments in three different backcross populations (BC1F1s), three F1 hybrids with sua-CMS sterile cytoplasm, and two sua-CMS lines (BC10F1). The BC1F1s were sZY/Speight G-28//Speight G-28, sZY/K326//K326, and sZY/NC82//NC82, and BC1F1s of the six other types of CMS (glu-CMS, rep-CMS, rus-CMS, tab1-CMS, tab2-CMS, and tab3-CMS) were tested in the same three nuclear backgrounds. The F1 hybrids were msK326 × RW, msZY90 × CT107, and ms317 × kx10.

We further amplified m-sr fragments in fertile tobacco accessions. These accessions belonged to the three subgenera (see “Plant materials”) and included flue-cured, sun-cured, burley, cigar, oriental, and rustica tobacco types, as well as wild-type species.

Transcript analysis

Total RNA was isolated from tobacco leaves, floral buds, anthers, and roots using a total RNA reagent kit (Thermo Scientific). After removing any possible genomic DNA contaminants using RNase-Free DNase I (Promega, Madison, WI, USA), cDNA synthesis was performed using a PrimeScript first-strand cDNA synthesis kit (TaKaRa). The RT-PCR reaction was terminated after 24 cycles, and tobacco Tubulin was used as the internal control. Primer sequences of the specific ORFs (Table 3) were designed using Premier Primer 5.0. All products were visualized using 1.5% agarose gels.

Table 3 Primer sequences used in transcript analysis

qRT-PCR was performed on an ABI 7500 (Thermo Scientific), using SYBR® Premix Ex Taq™ (TaKaRa) according to the manufacturer’s instructions. The final volume of the PCR mixture was 20 µL, including 10 µL of the SYBR Green Master mix reagent, 7.8 µL of sterile water, 0.4 µL of Dye II, 1 µL of cDNA, and 0.4 µL (2 mM) of each real-time PCR primer. The following cycling conditions were used: 95 °C for 30 s, followed by 40 cycles of 95 °C for 5 s and 60 °C for 34 s. The expression patterns of ORFs were determined relative to tobacco Tubulin, and relative expression levels were calculated. All assays were repeated at least three times.
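The text does not state which formula was used to calculate relative expression. As a minimal sketch, the Python example below assumes the widely used 2^(-ΔCt) approach, in which the mean Ct of the target ORF is normalized to the mean Ct of the Tubulin reference and amplification efficiency is taken to be close to 100%. The function name and all Ct values are hypothetical and are not data from this study.

```python
from statistics import mean
from typing import Sequence


def relative_expression(ct_target: Sequence[float], ct_reference: Sequence[float]) -> float:
    """Relative expression as 2^-(mean Ct_target - mean Ct_reference).

    Assumes ~100% amplification efficiency for both primer pairs; this is an
    illustrative choice, not necessarily the calculation used in the study.
    """
    delta_ct = mean(ct_target) - mean(ct_reference)
    return 2 ** (-delta_ct)


# Hypothetical Ct values from three technical replicates per tissue.
tubulin_ct = [20.0, 20.0, 20.0]
orf_ct_by_tissue = {
    "anther": [21.8, 22.0, 22.2],  # mean 22.0, delta Ct = 2
    "leaf": [23.9, 24.0, 24.1],    # mean 24.0, delta Ct = 4
}

expression = {
    tissue: relative_expression(ct, tubulin_ct)
    for tissue, ct in orf_ct_by_tissue.items()
}
print(expression)
```

Under this assumption, a difference of two cycles between tissues corresponds to a fourfold difference in relative expression.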

Results

Stamen morphology and pollen vitality of sua-CMS

ZY exhibited five full anthers, and the anthers were as long as the stigma, which is necessary for normal self-pollination (Fig. 1a). In contrast, the sua-CMS line sZY had shortened filaments and stigma-like or completely degraded anthers (Fig. 1d). Unlike its maintainer line ZY, in which the anthers dehisced when the flower opened, the anthers of sZY did not change in shape or color throughout the flowering process, even when the flower withered. Scanning electron microscopy (SEM) showed that the pollen grains of ZY were nearly round and had visible grooves (Fig. 1b), and they were viable (Fig. 1c). In sZY, no pollen grains were visible under SEM, other than some shards in the anther (Fig. 1e). sZY did not produce any viable pollen (Fig. 1f), and thus, it was considered fully male sterile.

Fig. 1

Anther morphology, pollen structure, and pollen viability for fertile tobacco Zhongyan100 (a–c) and the sua-CMS line msZhongyan100 (d–f) (a, d anther morphology; b, e pollen grains; c, f pollen viability)

Screening and characterization of mitochondrial sequences of sua-CMS

There were 393 and 417 mitochondrial ORFs (at least 210 nt) in ZY and sZY, respectively. Thirty-four ORFs were unique to sZY according to blastn and blastx analyses. We designed 27 primer pairs to amplify all 34 unique ORFs across seven CMS types and fertile tobacco, which comprised sZY (sua-CMS), ZY (fertile tobacco), 86-6 (glu-CMS), 98-43 (rep-CMS), 200-18 (rus-CMS), L1 (tab1-CMS), 4-1 (tab2-CMS), and CH03 (tab3-CMS). Six ORFs (orf82, orf103, orf115a, orf91, orf115b, and orf100), located at 122,845–123,313 bp (m-sr1), 342,705–343,743 bp (m-sr2), and 414,430–415,454 bp (m-sr3) of the mitochondrial genome of sZY (Accession no. KR071121), were amplified only in sZY. The other 28 ORFs were also amplified in some of the other six CMS types and therefore were not considered unique to sua-CMS (Table 4). The specific amplification products in sZY were cloned and sequenced. The sequences of the amplification products were completely consistent with the corresponding m-sr fragments. When the three m-sr fragments were analyzed via BLAST against the fertile tobacco mitochondrial genomes (Accession no. BA000042 and KR780036), we found that the six ORFs in the three m-sr sequences were chimeric (Fig. 2). orf82 in m-sr1 was 246 bp in length and contained six small fertile tobacco mitochondrial DNA fragments, accounting for 51.6% of orf82. The remaining sequence shared no homology with fertile tobacco mitochondrial genomes. orf103, orf115a, orf91, orf115b, and orf100 were between 273 and 345 bp in length, and each contained five or six small fertile tobacco mitochondrial DNA fragments. The homologous sequences accounted for 21.7, 24.9, 25.6, 39.9, and 27.4% of orf103, orf115a, orf91, orf115b, and orf100, respectively, and the remaining sequences shared no homology with fertile tobacco mitochondrial genomes.
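The coordinates and percentages reported above can be checked with simple arithmetic. The Python sketch below computes the length of each m-sr region, the share of each ORF with no match to fertile tobacco mitochondrial DNA, and a nominal ORF length. The coordinates are treated as 1-based and inclusive, and the number in each ORF name is assumed to be its codon count; both are assumptions, although the second agrees with the 246 bp reported for orf82 and the 273–345 bp range reported for the other five ORFs.

```python
import re

# m-sr coordinates in the sZY mitochondrial genome, as given in the text
# (assumed 1-based and inclusive).
m_sr_regions = {
    "m-sr1": (122_845, 123_313),
    "m-sr2": (342_705, 343_743),
    "m-sr3": (414_430, 415_454),
}
region_lengths = {name: end - start + 1 for name, (start, end) in m_sr_regions.items()}

# Percentage of each ORF homologous to fertile tobacco mtDNA, from the text.
homologous_pct = {
    "orf82": 51.6,
    "orf103": 21.7,
    "orf115a": 24.9,
    "orf91": 25.6,
    "orf115b": 39.9,
    "orf100": 27.4,
}
# The remainder of each ORF has no identified origin.
unexplained_pct = {orf: round(100 - pct, 1) for orf, pct in homologous_pct.items()}

# Assumption: the number in the ORF name is the codon count, so length = 3 x number.
orf_length_nt = {
    orf: 3 * int(re.match(r"orf(\d+)", orf).group(1)) for orf in homologous_pct
}

print(region_lengths)
print(unexplained_pct)
print(orf_length_nt)
```

The resulting range of sequence without identified origin, 48.4 to 78.3%, is the same range cited in the Discussion.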

Table 4 Open reading frames (ORFs) amplified in msZhongyan100 (sua-CMS), Zhongyan100, 86-6 (glu-CMS), 98-43 (rep-CMS), 200-18 (rus-CMS), L1 (tab1-CMS), 4-1 (tab2-CMS), and CH03 (tab3-CMS)
Fig. 2

Schematic structure of six ORFs and their flanking regions. Solid lines indicate homologous sequence; dotted lines indicate non-homologous sequence

Stability and specificity of the mitochondrial sequences

The amplification results showed that the specific mitochondrial fragments were consistently amplified in all plants with a sua-CMS background, regardless of changes in the nuclear genome (Fig. 3). These m-sr fragments were not amplified in BC1F1s of the other six tobacco CMS types or in their backcrossed male parents (Speight G-28, K326, and NC82), suggesting that the ORFs were related to sua-CMS. The three m-sr fragments were not amplified in any of the fertile tobacco accessions (Fig. S1).
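The presence or absence of the three m-sr fragments can be read as a simple marker score. The Python sketch below shows one way such a score could be turned into a cytoplasm call; the function name, the three-way outcome, and the example band scores are hypothetical encodings of the pattern described above, not a procedure taken from this study.

```python
from typing import Dict

M_SR_MARKERS = ("m-sr1", "m-sr2", "m-sr3")


def call_cytoplasm(bands: Dict[str, bool]) -> str:
    """Classify a sample from presence (True) or absence (False) of each m-sr band.

    Missing markers are treated as absent. Mixed patterns are flagged rather
    than forced into either class.
    """
    present = [bool(bands.get(marker, False)) for marker in M_SR_MARKERS]
    if all(present):
        return "sua-CMS"
    if not any(present):
        return "not sua-CMS"
    return "ambiguous"


# Illustrative band scores following the reported pattern: all three fragments
# in sua-CMS backgrounds, none in the fertile male parent.
samples = {
    "sZY/K326//K326": {"m-sr1": True, "m-sr2": True, "m-sr3": True},
    "K326": {"m-sr1": False, "m-sr2": False, "m-sr3": False},
    "hypothetical sample": {"m-sr1": True, "m-sr2": False, "m-sr3": True},
}
calls = {name: call_cytoplasm(bands) for name, bands in samples.items()}
print(calls)
```

Flagging mixed patterns as ambiguous, rather than calling them, reflects the fact that the text reports only all-present or all-absent results.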

Fig. 3

Amplified m-sr fragments present in different types of CMS lines with the same nuclear background. a m-sr1 amplification; b m-sr2 amplification; c m-sr3 amplification. Number above each lane matches the corresponding plant materials (listed in Table 1). M DNA marker DL2000

Transcript analysis of ORFs

In sZY, all six ORFs were expressed in the four plant tissues. They were expressed highly in floral buds and anthers and to a lesser extent in roots and leaves. In contrast, no RT-PCR products of any ORFs were observed in the four tissues of the maintainer line ZY (Fig. 4). The qRT-PCR results validated those of RT-PCR (Fig. 5). Both methods revealed similar changes in the expression of ORFs in sZY.

Fig. 4

Expression of the specific ORFs in the sua-CMS line msZhongyan100 and its maintainer line Zhongyan100 in four plant tissues

Fig. 5

Relative expression of the six specific ORFs in the sua-CMS line msZhongyan100 in four plant tissues

Discussion

In this study, we identified six unique ORFs in the tobacco sua-CMS line, all of which were located in three small regions of the mitochondrial genome of sua-CMS. These ORFs were all chimeric: each contained short fragments homologous to fertile tobacco mitochondrial DNA, whereas the remaining 48.4–78.3% of their sequences showed no homology with the mitochondrial genomes of the tested fertile tobacco lines, and we could not identify the origin of these sequences (Fig. 2). In plants with CMS, the formation of new ORFs is commonly due to recombination or rearrangement of the mitochondrial genome and new ORFs are often associated with CMS (Chen and Liu 2014). Mitochondrial genomes of different species vary greatly, because plant mitochondrial DNA (mtDNA) has extensive genomic variation and the non-coding sequences of mtDNA are not conserved across species (Gualberto and Newton 2017). Comparison of the mitochondrial genomes of Brassica napus and Arabidopsis thaliana (Unseld et al. 1997; Handa 2003) in the Brassicaceae family showed that one-third of the mtDNA of B. napus had no homology with the mtDNA of A. thaliana. Much of the mtDNA of Silene noctiflora and S. conica, which have many circular chromosomes (Sloan et al. 2012), lacked identifiable genes and could be rapidly gained or lost (Wu et al. 2015). When plants undergo interspecies somatic hybridization, mitochondrial genomes of the hybrids coexist with those of both parents and show recombination or rearrangement (Cardi et al. 1999; Zubko et al. 2001). The mitochondrial genomes differ significantly between CMS lines developed by interspecies hybridization and their parents. In the CMS line TK81-MS of sugar beet (Beta vulgaris L.), a total of 68 kb (about 13.6% of the total genome) were missing compared to the normal sugar beet line, TK81-O (Satoh et al. 2006). The variation in the mitochondrial genome of sua-CMS observed in this and other studies is, therefore, not unusual. The sua-CMS line of tobacco was developed by somatic hybridization between N. tabacum and N. suaveolens. Sequencing of the mitochondrial genomes confirmed that the mtDNA of the sua-CMS lines was 25% larger than that of the maintainer line or of other fertile tobacco lines, and most of this increase was due to non-coding repeat sequences (Sugiyama et al. 2005; Li et al. 2016).

Mitochondria are the site of cellular energy metabolism, in which the electron transport chain generates ATP through proton gradients and maintains the vital activity of cells (Millar et al. 2011). CMS in plants is associated with disturbance of the electron transport chain, impairing energy production. The energy impairment does not affect vegetative development or female flower development. The leading hypothesis to explain this phenomenon is that the developing microspores consume much more energy than any other tissues (Hanson and Bentolila 2004). Thus, genes encoding electron transport chain proteins and ORFs associated with CMS are characterized by high expression in floral tissues (Zhao et al. 2016; Yang et al. 2017). In addition to mitochondrial energy deficiency, cytotoxicity of CMS proteins and premature tapetal programmed cell death (PCD) in the anther, which are encoded or induced by ORFs associated with CMS, also explain some CMS systems (Chen and Liu 2014). These ORFs are specifically or highly expressed in the anther. In this study, the six described ORFs were unique to sua-CMS tobacco and were highly expressed in the floral buds and anthers (Figs. 4, 5). These results suggest that the ORFs may be related to mitochondrial function in sua-CMS. Further research will be conducted in our laboratory to test this hypothesis.

The most notable feature of CMS is anther abortion. Morphological and cytological identification of CMS lines could be used to classify sterile cytoplasm (Gonzalez-Melendi et al. 2008; Yang et al. 2008). Variation in the mitochondrial genome is an important driver of plant CMS formation (Hanson and Bentolila 2004; Gualberto and Newton 2017). Based on the mitochondrial genomic changes of sterile cytoplasms, CMS types have been identified precisely and accurately at the molecular level. Kim et al. (2007) developed a combination of molecular markers based on mtDNA rearrangement of radish (Raphanus sativus L.), which is now used for the reliable classification of radish cytoplasms. A novel type of cytoplasmic male sterility in radish was subsequently identified using these molecular markers (Lee et al. 2008). In rice, more than 60 CMS lines have been developed from interspecific, intersubspecific, and intervarietal crosses of Oryza plants with the AA genome. Based on PCR amplification with 264 random primers in two different backcross populations of rice, a mitochondrial SCAR marker of the HL-CMS/Rf system was developed (Tan et al. 2015). Heng et al. (2017) identified mitotype-specific sequences (MSSs) by comparative analysis of six mitochondrial genomes in B. napus and developed 12 mitotype-specific markers that could distinguish between six different mitotypes. These markers can provide guidance for classifying different mitotypes in B. napus breeding, and can be used to improve the efficiency of cross-breeding and heterosis utilization in cruciferous vegetables. We identified six specific ORFs of tobacco sua-CMS by mitochondrial genome analysis and amplification in different tobacco CMS types and in fertile Nicotiana accessions. These specific genes will be useful as molecular markers to distinguish sua-CMS from other types of CMS.

Conclusion

In this study, we obtained 34 unique ORFs in tobacco sZY by BLAST analysis of the mitochondrial genome ORFs of the sua-CMS line sZY and its maintainer line ZY. Six ORFs were amplified in plants with sua-sterile cytoplasm, including the F1 hybrids, BC1F1 offspring, and CMS lines, and were not amplified in plants with glu-, rep-, rus-, tab1-, tab2-, and tab3- tobacco sterile cytoplasms or in the 284 fertile tobacco accessions. This result indicated that they were sua-CMS-specific genes, which can be used to identify sua-CMS lines. Transcriptional analysis showed that the six ORFs were highly expressed in anthers and flower buds, but were expressed at low levels in vegetative organs, including leaves and roots, in sZY, suggesting that these ORFs may play a role in male sterility in sua-CMS.

Author contribution statement

AY and FL conceived and designed the study. YZ and ZL carried out the experiments. YS and GL contributed materials. YZ, AY, and FL analyzed the data and wrote the manuscript. All authors read and approved the final manuscript.