INTRODUCTION

Modern technologies for the genetic modification of plant genomes allow genes of heterologous origin to be transferred successfully in order to improve economically valuable traits in important agricultural crops. Traditional approaches to genetic modification (agrobacterial transformation, bioballistics, electroporation, etc.) are associated with random distribution of exogenous DNA sequences over the genome, whereas new genome editing technologies based on the CRISPR/Cas9 system allow researchers to integrate foreign genes into preselected target regions of the genome. The success of genetic modification of the plant genome is largely determined by a high and stable level of expression of the transferred target genes.

Extensive experimental material on the creation of genetically modified plants, accumulated over the past two decades, shows that the expression level of transgenes varies significantly between independently obtained initial transformants and their descendants [1–3] and is determined by many factors, the most important of which are the copy number and the site of integration of the foreign insertion in the genome [34]. For example, a transgene integrated into a hypermethylated region of the plant genome is highly likely to be inactivated during transmission to descendants [3–5]. The incorporation of several copies of exogenous DNA into one region of the genome with the formation of complex insertions (including rearrangements in the form of inverted sequences) is accompanied by a decrease in or complete loss of the expression of target genes [5–7].

The formation of spontaneous complex insertions during genetic transformation of the plant genome is of great interest to researchers, since most of the plant genome has evolved as clusters of genes and gene families. When economically valuable traits are being improved, plants in which integration took the form of a tandem insertion are discarded by researchers owing to unstable expression and silencing of the target gene. However, such plants are of great interest as models for identifying the causes and mechanisms that trigger the inactivation of foreign genes.

Two main mechanisms leading to silencing of a transgene in the genome of a genetically modified plant are known: impaired transcription of the target gene and degradation of already synthesized mRNA in the nucleus/cytoplasm [8, 9]. The key factor and trigger in this process is small interfering RNAs, which are formed when an enzyme complex cleaves double-stranded aberrant RNA transcripts transcribed from sequences of the genetic construction [10–13]. Disruption of the stability of transgene expression in plants leads to a decrease in or complete loss of expression, as well as to a mosaic pattern of expression at the level of somatic tissue cells [14].

Two epiallelic lines (Nu5 and Nu6) of transgenic tobacco (N. tabacum L.) plants that we created previously, which differ in the frequency of descendants with mosaic expression of the E. coli neomycin phosphotransferase II (nptII) selective gene, are a convenient model for identifying the causes and mechanisms of mosaic expression at the level of somatic tissues. Both lines were obtained by directional selection over three consecutive generations among descendants from self-pollination of the initial Nu21 tobacco transformant, in whose genome a complex T-DNA insertion (including two full-size T-DNA copies and one truncated copy located in the reverse orientation) was randomly integrated. When creating the Nu5 line, selection was carried out to reduce the frequency of mosaicism for the nptII gene at the phenotype level, whereas when creating the Nu6 line, selection was carried out to increase it [7, 15]. Hybridological analysis established statistically significant differences between the epialleles in the stability of expression and manifestation of the nptII gene, despite the insertion being located in the same region of the plant genome and, correspondingly, having the same nucleotide sequence adjacent to the T-DNA. The hybrids and descendants from self-pollination of the Nu5 line were relatively stable with respect to the expression of the selective gene, with low frequencies of mosaics among descendants, while increased gene inactivation and a high frequency of mosaic plants (up to 100%) were observed in the Nu6 line [16]. Differences in the mosaic pattern of nptII gene expression correlated with different levels of methylation in the promoter region and transcribed part of the selective gene [17]. Differences between the lines also persisted when they were transferred to the tetraploid level [18].

The aim of this work was to identify the features of expression of the target and selective genes included in the complex insertion in two lines of transgenic tobacco plants (Nu5 and Nu6) that differ sharply in mosaic manifestation of the nptII gene and to establish the triggers (aberrant RNAs) involved in its inactivation.

MATERIALS AND METHODS

Initial Material

T3 descendants of two lines of transgenic tobacco plants (Nu5 and Nu6) that differ sharply in the expression of the selective nptII gene, obtained by sequential selection for mosaicism among self-pollinated descendants of the initial transgenic Nu21 plant, served as a model for studying the expression of transgenes within a complex insertion [15]. The scheme of the complex insertion, which includes three T-DNA copies, one of which is inverted relative to the other two and carries a deletion of most of the nptII gene (truncated T-DNA copy), is presented in Fig. 1. The transgenic Nicotiana tabacum L. Nu21 plant was obtained by agrobacterial transformation with the genetic construction pC27-nuclS carrying the nptII gene (providing resistance of tobacco plants to the antibiotic kanamycin) and the Serratia marcescens secretory endonuclease gene under control of the bidirectional MAS promoter of the mannopine synthase gene of the A. tumefaciens Ti plasmid [15]. Phenotypically unstable expression of the nptII gene, seen as an alternation of green (kanamycin-resistant) and white (kanamycin-sensitive) regions on the leaf surface, is a distinctive feature of descendants of the Nu21 plant. For the comparative analysis of nptII gene expression in the two lines, as well as for the detection of aberrant transcripts in the region of the genetic construction, third-generation plants from self-pollination that were hemi- or homozygous for the T-DNA insertion were used. Homozygosity of descendants was determined by the absence of segregation in the next generation after self-pollination (all descendants were kanamycin-resistant); hemizygosity was determined by 3 : 1 segregation (kanamycin-resistant : kanamycin-sensitive).
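The zygosity criterion described above can be illustrated with a chi-square goodness-of-fit check against the expected 3 : 1 ratio. This is only a minimal sketch: the text does not state which test, if any, was applied to the seedling counts, and the function names and counts below are hypothetical.

```python
# Sketch: chi-square goodness-of-fit test for 3 : 1 segregation
# (kanamycin-resistant : kanamycin-sensitive) among selfed progeny.
# Assumption: a standard chi-square test with 1 degree of freedom;
# the paper does not specify the test used.

CHI2_CRITICAL_DF1_P05 = 3.841  # tabulated chi-square critical value, df = 1, alpha = 0.05


def chi_square_3_to_1(resistant: int, sensitive: int) -> float:
    """Return the chi-square statistic for observed counts vs. a 3 : 1 expectation."""
    total = resistant + sensitive
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def fits_3_to_1(resistant: int, sensitive: int) -> bool:
    """True if the counts do not deviate significantly from 3 : 1 at alpha = 0.05."""
    return chi_square_3_to_1(resistant, sensitive) < CHI2_CRITICAL_DF1_P05


if __name__ == "__main__":
    # Hypothetical progeny counts, for illustration only.
    print(fits_3_to_1(76, 24))   # progeny of a putatively hemizygous parent
    print(fits_3_to_1(100, 0))   # no segregation: consistent with a homozygous parent
```

In lines with mosaic expression, scoring seedlings as resistant or sensitive is itself uncertain, so a check of this kind is only as reliable as the phenotypic classification.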

Fig. 1. Schematic of T-DNA insertion in Nu21 plant genome. nuclS, S. marcescens secretory endonuclease gene; nptII, E. coli neomycin phosphotransferase II gene; PMAS, bidirectional promoter of A. tumefaciens Ti plasmid mannopine synthase gene; LB, RB, repeats bordering the T-region of A. tumefaciens Ti plasmid; curly arrows show the orientation of the copies in the T-DNA insertion; arrows show the primers and their direction; 1RT, 2RT, nuclS-RT, primers for reverse transcription; 1-1, 1-2, 2-1, 2-2, nuclS-1, nuclS-2, primers for RT-PCR on cDNA.

Analysis of Expression of Transgenes (nptII and nuclS)

Leaves of five-month-old tobacco plants were used to analyze the expression of transgenes. Total RNA was isolated using the RNeasy Plant Mini Kit (Qiagen) and treated with DNase I, and 4 µg of RNA was taken for cDNA synthesis (Thermo Scientific RevertAid First Strand cDNA Synthesis Kit).

The analysis of nptII gene expression was performed by real-time PCR on a CFX96 instrument (Bio-Rad, United States). The primers were selected for the central region of the nptII gene in full-size T‑DNA copies (Table 1). The amplification program: 95°C for 3 min; then five cycles without detection: 95°C for 10 s, 61°C for 20 s, 72°C for 5 s; then 40 cycles with detection at the annealing stage (FAM channel): 95°C for 10 s, 61°C for 20 s, 72°C for 5 s. The expression level was estimated using Bio-Rad CFX Manager 2.1 software. Each sample was analyzed in triplicate; data were normalized to the host glutamine synthetase (GSP) gene [19], which was analyzed in the same tube.
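Normalization of a target gene to a reference gene measured in the same reaction is commonly expressed through the difference in threshold cycles (ΔCt). The sketch below shows that calculation under stated assumptions: an amplification efficiency of 2 (perfect doubling per cycle) for both genes, and hypothetical Ct values and function names. It is not a description of the exact algorithm used by the CFX Manager software.

```python
# Sketch: reference-gene normalization of qPCR data via delta-Ct.
# Assumptions (not from the paper): efficiency = 2.0 for both amplicons;
# Ct values below are invented for illustration.
from math import sqrt
from statistics import mean, stdev


def relative_expression(ct_target: float, ct_reference: float,
                        efficiency: float = 2.0) -> float:
    """Target abundance relative to the reference gene: E ** (Ct_ref - Ct_target)."""
    return efficiency ** (ct_reference - ct_target)


def summarize_replicates(ct_pairs):
    """Return (mean, standard error of the mean) over technical replicates.

    ct_pairs is a list of (Ct_target, Ct_reference) tuples, one per replicate.
    """
    values = [relative_expression(t, r) for t, r in ct_pairs]
    sem = stdev(values) / sqrt(len(values))
    return mean(values), sem


if __name__ == "__main__":
    # Three hypothetical replicates: (Ct of nptII, Ct of GSP).
    replicates = [(25.0, 23.0), (24.0, 23.0), (26.0, 23.0)]
    avg, sem = summarize_replicates(replicates)
    print(f"relative expression = {avg:.3f} +/- {sem:.3f} (SEM)")
```

Measuring both genes in the same tube, as described above, means each replicate yields a paired Ct for target and reference, which is why the sketch normalizes per replicate before averaging.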

Table 1. Primer structure

The expression of the Serratia marcescens secretory endonuclease gene was analyzed by semi-quantitative RT-PCR. The primers are listed in Table 1. The cDNA amplification program: 1 cycle at 94°C for 3 min, 58°C for 30 s, 72°C for 1 min; then 34 cycles: 94°C for 1 min, 60°C for 30 s, 72°C for 1 min. Primers for the actin gene were used as a control.

The experiments were repeated two times in duplicate.

Identification of Aberrant RNAs

The location of the primers used to identify sense aberrant RNAs transcribed from the MAS promoter of the truncated nptII gene copy and antisense aberrant RNAs transcribed from putative promoters in the spacer sequence of the truncated nptII gene copy is shown in Fig. 1; the primer sequences are presented in Table 1. The cDNA amplification program: 1 cycle at 94°C for 3 min, 58°C for 30 s, 72°C for 1 min; then 34 cycles: 94°C for 1 min, 60°C for 30 s, 72°C for 1 min. Primers for the actin gene were used as a control.

Analysis of Vector DNA Integration

DNA was isolated using the GenElute© Plant Genomic kit (Sigma). The presence of vector DNA in the plant genome was analyzed by PCR with the primers TiL_U and pTi_p2_L for the left region adjacent to the insertion from the pC27-nuclS plasmid and with the primers npt_p3_R and TiR_L for the right adjacent region (Table 1). The amplification program: 1 cycle: 95°C for 3 min, 58°C for 30 s, 72°C for 1 min; 32 cycles: 95°C for 30 s, 60°C for 30 s, 72°C for 1 min.

Statistical Analysis

To compare expression levels in the homo- and hemizygous groups of transgenic plants, Kruskal–Wallis nonparametric analysis of variance was used (Statistica 5.5 software package) with multiple pairwise comparisons by Dunn’s test (critical value Qcr (k = 4, α = 0.05) = 2.639), where k is the number of compared samples (hemizygotes and homozygotes of the Nu5 and Nu6 lines) and α is the significance level [20].
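The critical value quoted above can be reproduced from the standard normal distribution. The sketch below assumes the usual form of Dunn's procedure, in which the two-sided significance level is Bonferroni-adjusted over all k(k − 1)/2 pairwise comparisons; the function names are illustrative, and the Q values passed in are the ones reported in the Results section, used here only as inputs.

```python
# Sketch: critical value for Dunn's multiple-comparison test.
# Assumption: Q_cr is the standard normal quantile at 1 - alpha / (k * (k - 1)),
# i.e. a two-sided Bonferroni adjustment over k * (k - 1) / 2 comparisons.
from statistics import NormalDist


def dunn_critical_value(k: int, alpha: float = 0.05) -> float:
    """Return Q_cr for k compared groups at family-wise significance level alpha."""
    n_comparisons = k * (k - 1) / 2
    return NormalDist().inv_cdf(1 - alpha / (2 * n_comparisons))


def is_significant(q: float, k: int = 4, alpha: float = 0.05) -> bool:
    """True if an observed Dunn statistic Q exceeds the critical value."""
    return q > dunn_critical_value(k, alpha)


if __name__ == "__main__":
    print(f"Q_cr(k = 4, alpha = 0.05) = {dunn_critical_value(4):.3f}")
    for q in (1.715, 0.857, 2.939):  # Q values reported in the Results
        print(q, "significant" if is_significant(q) else "not significant")
```

With k = 4 the adjustment divides α by 12, which gives a quantile that agrees with the tabulated 2.639 to within rounding.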

RESULTS AND DISCUSSION

Expression of nptII Gene and S. marcescens Secretory Endonuclease Gene in Hemi- and Homozygous Nu5 and Nu6 Tobacco Plants

The results of the comparative analysis of nptII gene expression in the two tobacco lines (Nu5 and Nu6), which differ sharply in the frequency of mosaicism, are presented in Fig. 2. The number of mRNA transcripts synthesized varies widely both from a single DNA template (in hemizygotes) and from two templates (in homozygotes). The relative abundance of nptII gene transcripts in hemizygous transgenic plants of the Nu5 and Nu6 lines (Fig. 2a) shows no statistically significant difference in the expression level of the selective gene (Q = 1.715).

Fig. 2. Level of nptII gene expression in hemizygous (a) and homozygous (b) transgenic tobacco plants with low (Nu5) and high (Nu6) level of appearance of mosaic descendants. Nu5 line plants, columns in black; Nu6 line, columns in gray. Expression level normalized with respect to the host GSP gene is indicated along the vertical axis; the numbers of the transgenic tobacco plants are indicated along the horizontal axis. Standard error of the mean is shown.

In homozygous plants of the Nu5 line (Fig. 2b), the expression level also does not differ from that in hemizygous plants (Q = 0.857). Since homozygotes carry two copies of the insertion rather than one, the absence of an increase suggests a decrease in the expression level of the selective gene during the transition from the hemi- to the homozygous state. At the same time, a significant decrease in nptII gene expression is observed in homozygous plants of the Nu6 line (Q = 2.939). A low level of transgene expression was detected in only two of the six analyzed plants; in the remaining descendants, expression was not detected, which indicates inactivation of the selective gene.

Previously, we established that a statistically significantly lower frequency of descendants with a mosaic pattern of the nptII gene expression is noted in the Nu5 line as compared with the Nu6 line, while a high frequency of mosaics (up to 100%), sharply depressed growth, and death on selective medium are typical of homozygous Nu6 plants [16].

The results of quantitative estimation of the transcriptional activity of the nptII gene in two transgenic tobacco plant lines differing in mosaicism confirmed previously established significant differences indicating that higher values of the nptII gene transcriptional activity correlate with a low frequency of its inactivation at the level of somatic tissue (Nu5 line) and, on the contrary, a decrease in the level of nptII gene expression is associated with a high frequency of inactivation and the appearance of mosaic descendants (Nu6 line).

Thus, statistically significant differences in the expression level of the selective gene are noted between homozygous plants of the studied transgenic tobacco lines Nu5 and Nu6. A general trend toward decreased transcriptional activity of the nptII gene is observed during the transition from the hemi- to the homozygous state, which agrees with the frequencies of inactivation of this gene previously established from its phenotypic manifestation. The decrease in nptII gene activity is more pronounced in Nu6 line plants, since it was in this line that genotypes with the maximum manifestation of mosaicism were selected among the descendants to obtain the next generation. In this regard, it was of interest to estimate the expression level of the other gene included in the expression cassette as a target, the S. marcescens secretory endonuclease gene (Fig. 1). Such data make it possible to judge whether there is coordinated inactivation of another gene included in the complex insertion.

The results of RT-PCR on hemi- and homozygous plants of the Nu5 and Nu6 lines are presented in Fig. 3. In the Nu5 line, endonuclease transcripts are detected in all analyzed plants (hemi- and homozygous) at the level of actin gene expression. In the Nu6 line, the transcript is detected only in some hemizygous plants (Fig. 3a, numbers 6/3, 6/10, 6/21), at a level lower than that of the actin gene, and it is completely absent in all analyzed homozygous descendants (data not presented). Consequently, inactivation of transgene expression in homozygous descendants of the Nu6 line can extend over the whole T-DNA insertion, that is, affect all copies of the nptII gene and the S. marcescens secretory endonuclease gene. As a rule, genetic constructions carrying several genes (marker/selective gene, target gene) are transferred to plants. A number of studies have demonstrated that disrupted expression of one of the genes can correlate with inactivation of adjacent genes [21, 22] or have no effect on the stability of their expression [23, 24].

Fig. 3. Electropherogram of PCR products after reverse transcription of total RNA of transgenic Nu5 and Nu6 line plants with the primers to the S. marcescens secretory endonuclease gene and actin. (a) Hemizygous plants; (b) homozygous plants. M, DNA molecular weight marker.

A decrease in the expression level of genes included in a multicopy insertion during the transition from the hemizygous to the homozygous state has been noted in many studies. This phenomenon is called the gene dose effect: suppression of gene expression occurs when homologous sequences are present in both allelic (in homozygous descendants) and non-allelic positions (when transformants are crossed) [25, 26]. Transcription of antisense and aberrant RNAs is one of the best-studied triggers of gene silencing at the transcriptional or post-transcriptional level [12, 13].

Antisense and Aberrant RNAs Read in the Region of Truncated Inverted Copy of nptII Gene

The T-DNA insertion in the transgenic Nu5 and Nu6 lines has a complex tandem structure; therefore, there is a high probability that aberrant transcripts are transcribed from the MAS promoter of the truncated inverted copy, in which most of the nptII gene is deleted (95 bp instead of 794 bp in the full copy), and from potential promoter regions in the spacer sequence adjacent to the deleted gene. Since this T‑DNA copy lacks a transcription terminator, transcription of sense and antisense RNAs that include the incomplete coding sequence of the nptII gene as well as the adjacent noncoding DNA sequence is possible. Figure 4 presents RT-PCR data obtained after reverse transcription of total RNA from homozygous transgenic plants of the Nu5 and Nu6 lines with primers spanning the coding sequence of the truncated nptII gene copy and the adjacent spacer sequence, which is unique within the whole transgenic insertion; this makes it possible to detect RNA transcribed only from the truncated copy (Fig. 1, Table 1; the primer 1RT was used for reverse transcription; the primers 1-1 and 1-2 were used for RT-PCR). The presence of the 366 bp PCR product indicates the synthesis of antisense aberrant RNA, which is detected in Nu5 plants only in descendants with a high level of transgene expression (5-10, 5-16, 5-21, Fig. 2); in plants with a lower expression level, no antisense aberrant RNA is detected by this method. The PCR product of antisense aberrant RNA is completely absent in homozygous plants.

Fig. 4. Electropherogram of PCR products after reverse transcription of total RNA of transgenic Nu5 and Nu6 line plants with the primers to antisense aberrant RNA and actin. M, DNA molecular weight marker.

Figure 5a shows an electropherogram of PCR products obtained after reverse transcription of total RNA from hemizygous transgenic plants of the Nu5 and Nu6 lines with primers for sense aberrant RNA transcribed across the unique spacer sequence, the truncated nptII gene copy, and the spacer sequence between the nptII gene and the MAS promoter (Fig. 1; the primer 2RT for reverse transcription and the primers 2-1 and 2-2 for RT-PCR). The target amplification product (311 bp) is detected in all analyzed Nu5 line plants; in the Nu6 line, the transcript is detected only in some descendants (6-19, 6-21, 6‑67) as a weak signal compared with the control (actin gene).

Fig. 5. Electropherogram of PCR products after reverse transcription of total RNA of transgenic Nu5 and Nu6 line plants with the primers to sense aberrant RNA and actin. (a) Hemizygous plants; (b) homozygous plants, reamplification. M, DNA molecular weight marker.

In homozygous transgenic plants of the Nu5 and Nu6 lines, no sense aberrant RNA was found by the standard semi-quantitative PCR procedure (data not presented), but in the Nu5 line transcription was detected after a twofold increase in the reverse transcription reaction time or after reamplification RT-PCR (Fig. 5b). Consequently, aberrant RNA is synthesized in small amounts in homozygous transgenic plants of the Nu5 line. In homozygous plants of the Nu6 line, the transcript was not found even after reamplification, which indicates its absence (data not presented). Thus, the decrease in or inactivation of nptII gene expression is coordinated across all T-DNA copies (both full-size and truncated) in homozygous descendants of the Nu5 and Nu6 lines.

Since the synthesis of sense and antisense aberrant RNAs occurs in the region of the inverted truncated copy of the nptII gene, they can form double-stranded, partially complementary RNAs both among themselves and with the transcripts read from full nptII gene copies in T-DNA 1 and T-DNA 2. The resulting double-stranded RNA can with high probability be responsible for the activation of the mechanisms of RNA-mediated gene silencing at the transcriptional level.

The association between the synthesis of aberrant RNAs and inactivation of the Lpt2-gus transgene was demonstrated for transgenic rice plants. The researchers obtained the lines with a mosaic expression of the gus gene in the aleurone layer of rice seeds. A mosaic pattern of the reporter gene expression was inherited among generations. The study of one of these lines demonstrated that one of two T-DNA copies in these plants is truncated and inversely oriented, which leads to the formation of antisense aberrant RNAs [10].

Studies on transgenic Arabidopsis thaliana plants demonstrated that genetic constructions lacking a transcription terminator or carrying incomplete gene copies also efficiently trigger inactivation of the expression of homologous genes in the host genome [12, 27]. A high frequency of inactivation is provided by genetic constructions that include the gene/gene fragment between oppositely directed promoters. In this case, short non-polyadenylated RNAs of different sizes are formed [11]. Internal promoter-free regions of T-DNA were found to be transcribed in 65% of transformed cell lines of BY-2 tobacco. Such spontaneous transcription triggers gene inactivation when inverted repeats are present within an insertion. The authors explain this phenomenon by T-DNA insertion into transcriptionally active regions of the plant genome and by features of the chromatin structure [13].

The genomic environment into which heterologous genes are transferred certainly affects the stability of their expression. For example, modeling a situation where transgenes fall into a region of highly repetitive sequences, by introducing repeated sequences from the Petunia hybrida genome into the genetic construction, resulted in inactivation and mosaic expression of the marker gene [28]. A number of studies demonstrated that the synthesis of aberrant transcripts can be initiated from promoters in sequences adjacent to the T-DNA and can direct silencing of transgenes [13, 29].

It is known that, during agrobacterial transformation, plasmid DNA regions adjacent to the T-DNA region can be inserted into the plant genome together with the T-DNA. The presence of such regions with a large number of repeats can be an additional factor in the manifestation of RNA interference. The probability of integration of vector DNA fragments during agrobacterial transformation largely depends on the type of plasmid and the transformation conditions; for Nicotiana tabacum, this probability is about 1.3%, but in some cases it reaches 70% [30]. We tested this possibility for the Nu21 line and demonstrated the absence of sequences from the pC27-nuclS plasmid adjacent to the right/left borders of the T-DNA (data not presented).

It is interesting to note that even the insertion of one copy of the uidA gene into the same host genome region by the Cre/lox recombination system resulted in significant differences in the expression level between transformants and in the appearance of plants with an inherited mosaic pattern of transgene expression [31]. Mosaicism also occurs in interspecific hybrids or with chromosomal rearrangements leading to an unstable state of the genes. In maize, insertions in the enhancer and the presence of unique copies in repeats of the P1-mm allele as compared with the P1-wr allele were associated with a mosaic-colored pericarp. All this may be a prerequisite for the formation of aberrant RNAs and activation of silencing mechanisms [32]. Mosaic coloration of the corolla in Petunia hybrida is associated with the presence of two tandemly located CHS-A gene copies in the genome [33]. Consequently, both the genomic environment and internal features of the structure of a foreign insertion can have a significant effect on the stability of manifestation of foreign genes.

Thus, in transgenic tobacco plants of the Nu5 and Nu6 lines, which differ sharply in the mosaic pattern of expression of the selective nptII gene, the formation of sense and antisense aberrant incomplete RNA transcripts from the truncated inverted nptII gene copy is the most probable trigger of the inactivation of genes in the complex insertion. Differences between the lines in the efficiency with which this mechanism is triggered most likely lie outside the complex insertion and are associated with features of the genomic environment of the region where the studied T-DNA insertion is integrated, which is supported by the efficiency of selection for decreased/increased mosaicism among descendants of subsequent generations. These lines of transgenic tobacco plants are of undoubted interest for further study of the causes and mechanisms of unstable expression and inheritance of foreign genes associated with epigenetic mechanisms.