INTRODUCTION

Regulation of ribosomal protein synthesis in Escherichia coli cells was discovered more than 40 years ago and has been well studied. Protein-coding genes in prokaryotes are usually organized into functional units called operons, which are under the control of an operator. Ribosomal protein (r-protein) operons may also include non-ribosomal protein genes. For example, in E. coli they contain genes encoding components of the replication complex (dnaG encodes primase and priB encodes primosomal protein N), RNA polymerase subunits (rpoA, rpoB, rpoC, and rpoD), translation factors (tsf, fus, and tufA), genes whose products are involved in rRNA maturation (rimM), tRNA modification (trmD) and processing (rnpA), and protein export across the membrane (secY). Ribosome biogenesis requires approximately equimolar amounts of r-proteins and rRNA. Many prokaryotic r-proteins regulate the expression of both their own genes and those of other proteins in their operon (Fig. 1). Regulatory proteins include proteins of both the small ribosomal subunit (S1, S4, S7, S8, S15, and S20) and the large ribosomal subunit (L1, L4, L10, L12, and L20).

Fig. 1.

Organization of the E. coli r-protein genes in operons. Repressor proteins are shown as circles; the regions of mRNA with which they interact are indicated by the symbol ┴. Genes whose expression is regulated by the repressor protein are indicated in white; genes that are not regulated by the corresponding r-protein are highlighted in gray. P—promoters; t and att—transcription terminator and attenuator, respectively. (a) S10 operon. (b) L11 operon. (c) S1 operon. (d) S2 operon. (e) L20 operon. (f) S6 operon. (g) L10 operon. (h) S15 operon. (i) S4/α operon. (j) S8/spc operon. (k) S7/str operon. (l) S20 operon.

The main principle of r-protein synthesis regulation is feedback: when one of the r-proteins encoded by an operon is synthesized in excess, it acts as a repressor of translation of the mRNA of the entire operon. The operator can be located either upstream of the first gene or between the genes of the operon. The mRNA binding sites of some repressor proteins are homologous to the rRNA binding sites of these proteins; however, many r-proteins regulate protein synthesis by binding to an mRNA site that has no explicit homology to a specific rRNA site.

Autoregulation of r-protein synthesis can be performed by various mechanisms. Competitive inhibition involves competition between the specific binding sites on mRNA and rRNA for the regulatory protein. In the “entrapment” mechanism, the interaction of the repressor protein with mRNA leads to the formation of structures that hinder the initiation of mRNA translation. Retroregulation involves the destabilization of mRNA upon interaction with a repressor protein. In addition, regulation can be carried out either by single proteins or by their complexes.

REGULATION OF RIBOSOMAL PROTEIN SYNTHESIS BASED ON COMPETITIVE INHIBITION

The competitive mechanism of autoregulation of r-protein synthesis involves competition between mRNA and rRNA for binding to the repressor protein. As a rule, the repressor’s affinity for mRNA is significantly lower than for rRNA.
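The consequence of this difference in affinity can be illustrated with a simple equilibrium calculation. The sketch below is not taken from the cited studies: it assumes 1 : 1 binding, treats the operator mRNA as too scarce to deplete the repressor pool, and uses arbitrary concentrations and dissociation constants (with the mRNA affinity set tenfold weaker than the rRNA affinity). The function names are hypothetical.

```python
import math


def free_repressor(r_total, rrna_total, kd_rrna):
    """Free repressor left after 1:1 equilibrium binding to rRNA sites.

    Positive root of x^2 + x*(rrna_total - r_total + kd_rrna) - kd_rrna*r_total = 0.
    All quantities are in the same (arbitrary) concentration unit.
    """
    b = r_total - rrna_total - kd_rrna
    return (b + math.sqrt(b * b + 4.0 * kd_rrna * r_total)) / 2.0


def mrna_occupancy(r_total, rrna_total, kd_rrna, kd_mrna):
    """Fraction of operator mRNA bound by the repressor.

    Assumption: mRNA is scarce, so it does not change the free repressor level.
    """
    r_free = free_repressor(r_total, rrna_total, kd_rrna)
    return r_free / (r_free + kd_mrna)


# Illustrative values only, not measured data.
KD_RRNA, KD_MRNA = 0.01, 0.1  # mRNA affinity assumed 10-fold weaker than rRNA
rrna_excess = mrna_occupancy(1.0, 2.0, KD_RRNA, KD_MRNA)
protein_excess = mrna_occupancy(3.0, 2.0, KD_RRNA, KD_MRNA)
print(f"rRNA in excess: {rrna_excess:.2f}; r-protein in excess: {protein_excess:.2f}")
```

Under these assumptions the operator is mostly free while rRNA sites outnumber the repressor and becomes mostly occupied once the repressor is in excess, which is the behavior expected of feedback regulation.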

S10 Operon

Expression of the S10 operon of E. coli containing 11 r-protein genes is controlled by the L4 r-protein (Fig. 1a) [1]. L4 regulates both translation and transcription of its own operon genes [2].

In the bacterial ribosome, L4 binds mainly to domain I of 23S rRNA, and its extended loop forms part of the polypeptide exit tunnel [3]. It is believed that the C-terminal part of the protein is responsible for its regulatory properties, and the central part is necessary for incorporation into the ribosome [4].

The binding sites of the L4 protein on 23S rRNA and mRNA show no clear similarity (Fig. 2) [5]. The main site of interaction of the L4 protein with 23S rRNA is a junction of four helices that contains a loop (Fig. 2b) [6]. By binding to a region of the S10 operon mRNA, the L4 protein controls the synthesis of the S10 operon proteins at both the transcription and translation levels. This region is located in the 5'-untranslated region (5'-UTR) of rpsJ, the first gene of the operon, and contains the HD–HG hairpins (Fig. 2a).

Fig. 2.

Secondary structure of the L4 protein binding sites on RNA in E. coli. (a) 5'-UTR of the S10 protein gene (rpsJ). The nucleotides of the transcription terminator are highlighted in blue; the SD sequence and the rpsJ start codon are highlighted in green. The minimal mRNA fragment with a high affinity for the L4 protein is highlighted by a dashed frame. (b) A minimal fragment of 23S rRNA that specifically binds to the L4 and L24 proteins. The sites with which the L4 protein interacts are highlighted in red.

It is known that the mRNA regions of the E. coli S10 operon responsible for transcription termination and translation inhibition partially overlap but are not identical [7, 8]. The 5'-UTR of the operon contains six hairpins (Fig. 2a); however, the first three hairpins are insignificant for the regulation of operon expression in vivo. The HD and HE hairpins are necessary for transcription control, and the HE and HG hairpins are most important for translation control [7, 9, 10]. The HE hairpin and the uridine-rich sequence located downstream of it form a ρ-independent transcription terminator. The minimal fragment of the E. coli S10 operon mRNA that has a high affinity for the L4 protein contains the HD hairpin and part of the HE hairpin [5]. The conserved structure of the regulatory region of the S10 operon mRNA was predicted based on bioinformatic analysis [11]. It was assumed that the HD hairpin loop and the conserved unpaired nucleotides flanking this hairpin are the main L4 binding site (Fig. 2a) [12].

Transcription is regulated by the interaction of the L4 protein with the 5'-UTR of the mRNA, the NusA transcription factor, and RNA polymerase [1]. Upon interaction with NusA, the L4 protein prolongs the pause of RNA polymerase at the terminator site of the HE hairpin, which leads to premature transcription termination [13–15].

In some bacteria, L4-mediated regulation is absent. For example, the L4 protein of Pseudomonas aeruginosa does not inhibit the synthesis of its own operon proteins, and the 5'-UTR of S10 mRNA lacks determinants similar to the binding site of this protein in E. coli. The P. aeruginosa operon carrying the L4 protein gene also contains the L24 protein gene, and the region around the 5'-UTR of the rplC gene mRNA is similar to the binding site of these proteins in the ribosome. Therefore, it was assumed that both the L4 and L24 proteins are involved in the regulation of the S10 operon of P. aeruginosa. Apparently, the mechanisms of regulation of this operon differ between organisms [16].

The S10-like operon of the archaeon Methanocaldococcus jannaschii contains not 11 r-protein genes, as in E. coli, but five (L3, L4, L23, L2, and S19). The first gene of this operon encodes the L3 protein, not the S10 protein. The synthesis of the proteins of this operon in M. jannaschii is also regulated by the L4 protein. The L4-binding site includes the 5'-UTR and the beginning of the coding part of the L3 protein mRNA [17].

L11 Operon

The conserved two-domain L1 r-protein regulates translation of the bacterial L11 operon mRNA. This protein participates in the formation of the L1 protrusion of the 50S ribosomal subunit; it strongly and specifically binds to 23S rRNA in the region of helices 76–78. When 23S rRNA is lacking, the L1 protein binds to a specific part of the mRNA of its own operon and prevents its translation. The regulation of the L11 operon of E. coli [16–20] and the L1 operon of archaea of the genus Methanocaldococcus has been studied in detail [21, 22].

The binding site of the L1 protein on the L11 operon mRNA of E. coli is located in 5'-UTR of the L11 protein mRNA (Fig. 3). The coupling of translation of the operon genes leads to coupling of translation repression [23, 24]. In the archaea M. vannielii and M. jannaschii, the L1 protein binding site is located in 5'-UTR of its own mRNA (Fig. 3). Autoregulation of the L1 protein synthesis of M. vannielii occurs either before the formation of the first L1 peptide bond, or at this stage [25].

Fig. 3.

Organization of the L11 operon genes in bacteria and the L1 operon in archaea. The repressor protein is indicated by a circle, and the regions of mRNA with which it interacts are shown by the symbol ┴.

The search for L1 protein binding sites on the mRNA of various bacteria showed that the position of the L1 binding site is not strictly conserved [11]. In some groups of bacteria, these sites are located upstream of the L11 protein gene (Proteobacteria, Spirochaetes, Thermotogae, and Tenericutes); in others, they are located upstream of the L1 protein gene (Cyanobacteria, Actinobacteria, and Chloroflexi); and in 40% of Firmicutes genomes, L1 protein binding sites are found upstream of both genes (Fig. 3). Recently, two L1 regulatory sites were found on the L11 operon mRNA of Thermotoga maritima: the first in the 5'-UTR of the L11 protein mRNA, as in E. coli, and the second spanning the leader and coding regions of the L1 protein mRNA [26].

The structures of L1 protein binding sites on rRNA are conserved in all domains of life [27, 28]. The binding sites of the L1 protein on mRNA have a high homology of the primary and secondary structure with its binding site on rRNA [22, 29] (Fig. 4). However, the binding constants of the L1 protein to rRNA and mRNA differ by about an order of magnitude, as a result of which the translation of the L11 operon is regulated according to the classical feedback principle [24].

Fig. 4.

L1 protein binding sites on 23S rRNA (T. thermophilus) and mRNA (E. coli and M. jannaschii). Nucleotides forming conserved contacts with the L1 protein on RNA fragments are highlighted in red.

Since the L1 protein binding sites on mRNA and rRNA are homologous in bacteria and archaea (Fig. 4), L1 of the archaeon M. vannielii is able to functionally replace the L1 protein of E. coli both as a part of the ribosome and as a translation repressor, and L1 of E. coli inhibits translation of the L1 operon mRNA of M. vannielii in vitro [22, 30]. The L1 protein of the bacterium Thermus thermophilus is also able to regulate the synthesis of the L1 operon proteins of the archaeon M. vannielii in vitro [31].

Structural and biochemical studies indicate a leading role of the L1 protein domain I in interaction with RNA in bacteria [31–33] and archaea [34].

S1 Operon

The S1 r-protein encoded by the rpsA gene regulates its own synthesis at the translation level (Fig. 1c) [35, 36]. S1 is a protein of the small ribosomal subunit located between the head and the platform of the 30S subunit; it contacts mRNA, r-proteins [37, 38], and RNA polymerase [39, 40].

It is known that the S1 protein is necessary for translation of some mRNAs [35], including its own [41]. The protein contains six domains. Three N-terminal domains (D1–D3) interact with r-proteins, as well as with various mRNAs in the ribosome, and have RNA chaperone activity [42]. These domains are involved in the formation of the preinitiation complex of the 30S ribosomal subunit with mRNA. C-terminal domains D3–D6 provide specificity for recognizing single-stranded regions of various mRNAs [43, 44]. The binding site of the S1 protein on its own mRNA is located in the 5'-UTR and is formed by three hairpins (I–III) separated by AU-rich single-stranded regions (ss1 and ss2) (Fig. 5a). Hairpin III contains the start codon and an SD-like element (GAAG) (Fig. 5a) that forms only three complementary base pairs with the anti-SD sequence of 16S rRNA [45]. Nevertheless, the S1 operon is one of the strongest operons in E. coli, with effective negative autogenic control [36]. The GG(A) sequences in the loops of hairpins I and II of the mRNA, together with the weak SD, can form a common SD element separated in space (Fig. 5b) [35]. Regulation of S1 protein synthesis occurs at the level of 30S preinitiation complex formation. A 30S ribosomal subunit lacking the S1 protein is not able to form a preinitiation complex with rpsA mRNA in vitro. The addition of the S1 protein to S1-deficient 30S subunits in a 1 : 1 molar ratio restores their ability to bind rpsA mRNA, while an excess of the protein inhibits this binding (Fig. 5b).
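The weakness of the GAAG element can be checked by counting consecutive Watson-Crick pairs between an SD-like element and the anti-SD region. The sketch below is only an illustration: the anti-SD string is the commonly cited 3'-terminal sequence of E. coli 16S rRNA and is an assumption not given in the text, the function name is hypothetical, and only ungapped antiparallel alignments with canonical pairs (no G∙U wobble) are considered.

```python
# Assumption: 3'-terminal sequence of E. coli 16S rRNA, written 5'->3'.
ANTI_SD = "ACCUCCUUA"
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def max_contiguous_pairs(element, anti_sd=ANTI_SD):
    """Longest run of consecutive Watson-Crick pairs over all ungapped,
    antiparallel alignments of an mRNA element with the anti-SD region."""
    target = anti_sd[::-1]  # read 3'->5' so that indices pair directly
    best = 0
    for offset in range(-(len(element) - 1), len(target)):
        run = 0
        for i, base in enumerate(element):
            j = i + offset
            if 0 <= j < len(target) and (base, target[j]) in WATSON_CRICK:
                run += 1
                best = max(best, run)
            else:
                run = 0
    return best


weak_sd = max_contiguous_pairs("GAAG")         # SD-like element of rpsA mRNA
canonical_sd = max_contiguous_pairs("AGGAGG")  # canonical SD, for comparison
print(weak_sd, canonical_sd)
```

With these assumptions, GAAG gives a run of three pairs, in line with the statement above, whereas the canonical AGGAGG element pairs along its full length.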

Fig. 5.

Regulatory region of rpsA mRNA. (a) Secondary structure of the S1 protein translation initiation site on rpsA mRNA. Conserved GG nucleotides are marked in gray. The SD sequence is highlighted with a frame. (b) Scheme of autogenic regulation of the S1 protein synthesis.

The S1 protein interacts with the single-stranded ss1 and ss2 sites on rpsA mRNA (Fig. 5a). This interaction changes the structure of the rpsA translation initiation site, disrupting its active conformation, and prevents the formation of a preinitiation complex with the 30S subunit [35] (Fig. 5b).

The secondary structure of the 5'-UTR of rpsA mRNA is similar in five families of γ-proteobacteria (Enterobacteriaceae, Pasteurellaceae, Vibrionaceae, Erwiniaceae, and Shewanellaceae) [45]. Hairpins II and III are quite conserved; the interhelical region usually contains an AU-rich sequence [11]. The weak helix III always contains an SD-like element and an AUG in the loop, while the loops of hairpins I and II have GGA triplets (Fig. 5a).

S2 Operon

The S2 operon (the rpsB–tsf operon) of bacteria encodes the S2 r-protein and the Ts elongation factor (Fig. 1d). In E. coli cells this operon is regulated at the level of translation by the S2 protein [46].

The globular domain of the S2 protein interacts with the “body” of the 30S ribosome subunit, and the double-stranded domain is directed to the “head” of the small subunit. It has been shown that the S2 protein on the ribosome participates in SD binding at the translation initiation stage [47]. Homology of the S2‑binding site on mRNA and on 16S rRNA (h26 and h35–h37) has not been found [48].

The regulatory site with which the S2 protein interacts is located in the 5'-UTR of rpsB mRNA. The CR and RH regions of rpsB mRNA are the most important for regulation (Fig. 6). Interestingly, the S2 protein regulates rpsB–lacZ expression more effectively in the presence of the S1 protein [46]. The S2 protein is required for incorporation of S1 into the ribosome [40] and can form a complex with S1.

Fig. 6.

The secondary structure of 5'-UTR of rpsB mRNA of E. coli. LH and RH are conserved stem-loop structures; CR is a central, weakly structured region. Conserved nucleotides are highlighted in red.

The rpsB gene promoter with an extended “–10” element (TGTG) and the secondary structure of the 5'-UTR of rpsB mRNA are conserved in γ-proteobacteria [49]. The tsf gene does not have its own promoter, and EF-Ts is synthesized from the bicistronic rpsB–tsf mRNA [50, 51]. The genes are separated by an extended region containing inverted repeats followed by an attenuator [52]. Upon S2 protein binding to its site on rpsB mRNA, EF-Ts synthesis is also inhibited [46]. The activity of the S2 operon promoter decreases upon amino acid starvation in vivo or with an increase in the concentration of the alarmone ppGpp in vitro. The GC-rich nucleotide sequence that separates the “–10” element from the transcription start site is important for this regulation [49].

Thus, the regulation of the synthesis of the S2 and EF-Ts proteins is carried out both by the S2 protein at the translation level on the feedback principle, and at the transcription level by the global regulator ppGpp. Moreover, effective and regulated transcription of the S2 operon requires a combination of all the conserved elements of the rpsB promoter [49].

L20 Operon

The L20 operon (the rpmI–rplT operon) includes the genes of the L35 (rpmI) and L20 (rplT) r-proteins and of translation initiation factor 3, IF3 (infC) (Fig. 1e). The L20 protein directly inhibits translation of the L35 cistron, which is coupled with translation of its own cistron [53]. The protein primarily binds to rRNA of the 50S subunit [54] and interacts with the site between helices 40 and 41 of 23S rRNA.

The C-terminal domain of the protein is sufficient to repress translation of the L20 operon in vivo [55]. Two regulatory sites, with which L20 interacts with the same affinity, were found in the intercistronic region of infC–rpmI mRNA of E. coli (Fig. 7). The first site includes a pseudoknot formed by the infC and rpmI regions (Figs. 7a, 7b) [55]; it promotes the coupling of translation of the IF3 and r-protein cistrons [56]. The second site (Figs. 7a, 7c) contains the central part of the t1 helix, the structure of which is similar to the binding site of the L20 protein on 23S rRNA (Fig. 7d) [57]. The presence of two L20 binding sites was confirmed in vivo by mutational analysis of the L20 operon mRNA [58]. These regions are located near the pseudoknot, forming one region in the three-dimensional structure of the operator. The minimal mRNA region required for translation repression includes the pseudoknot and the lower two-thirds of the t1 helix (Fig. 7) [55, 58].

Fig. 7.

The secondary structure of the three L20-binding sites on RNA. Conserved nucleotides in all regions are highlighted in red. (a) The region of infC–rpmI mRNA of E. coli. (b) The first site of L20 binding to mRNA, a pseudoknot in which the S2 helix is formed by mRNA regions that are far from each other (shown by converging arrows). (c) The second L20 binding site on mRNA. The start codon of the rpmI gene and the stop codon of the infC gene are highlighted. (d) L20 binding site on 23S rRNA of E. coli.

It was assumed that the L20 protein regulates the synthesis of its own operon proteins by competition between the repressor and the ribosome for binding to mRNA [59]. During transcription of the rpmI operator, the S1 mRNA hairpin is synthesized first (shown in blue in Fig. 8, step 1); then the t1 hairpin is synthesized (shown in purple in Fig. 8, step 1). After the t1 hairpin is formed, the L20 protein strongly binds to the site on this hairpin (Fig. 8, step 2). After that, the mRNA region corresponding to the 3' end of the S2 hairpin is synthesized (Fig. 8, step 3, the linear region of the 3' end of mRNA is shown in blue); this leads to the formation of a pseudoknot (Fig. 8, step 4). After the operator site has adopted the necessary conformation, the previously bound L20 protein molecule changes its position on the mRNA (Fig. 8, step 5) and blocks the site of interaction with the ribosome. It was assumed that the interaction of L20 with the t1 mRNA hairpin allows a temporary increase in the local concentration of the L20 protein near mRNA until the mRNA pseudoknot is formed. The structure of the L20 complex with the operator part of mRNA is currently unknown, but it has been shown that the protein binds to this region in a molar ratio of 1 : 1 [59].

Fig. 8.

The proposed model of regulation of rpmI–rplT genes by the L20 protein. The regions of mRNA forming the pseudoknot: S1 hairpin of infC mRNA and the 5'-UTR region of rpmI mRNA forming the S2 helix are highlighted in blue. The t1 hairpin of infC mRNA is shown in purple. Yellow indicates a region of the rpmI gene, L20 r-protein is shown as a green oval.

In B. subtilis only one site with which the L20 protein interacts was found, and it differs from the regulatory site of E. coli. This region is located in 5'-UTR of infC mRNA, and regulation is performed at the transcription level, but not at the translation level. However, this mRNA site also has similarities to the L20-binding site of the 23S rRNA of B. subtilis [60]. Thus, despite the difference in the structures of L20-binding RNA sites in different organisms, the protein has similar determinants for interaction on mRNA and rRNA.

S6 Operon

In the genomes of many bacterial species, the rpsF gene encoding the S6 r-protein is located next to the priB (a component of the primosome) and rpsR (the S18 r-protein) genes (Fig. 1f). The S6 operon (the rpsF operon) of E. coli also includes the rplI gene (the L9 r-protein) [61]. The S6 and S18 proteins are not primary rRNA-binding proteins; they form a heterodimer that interacts with the 16S rRNA region associated with the S15 r-protein [62–64]. The CCR nucleotides (R = A/G) in 16S rRNA are the only conserved region that specifically interacts with the S6∙S18 complex (Fig. 9b). A similar sequence is found in the S6∙S18 binding site on rpsF mRNA (Fig. 9a), indicating the key role of this element in the interaction of the protein complex with RNA [65]. The affinity of S6∙S18 for rRNA is higher than for mRNA [66].
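Candidate CCR triplets (R = A or G) in an RNA sequence can be located with a short pattern search. The sketch below is an illustration only: the function name and the test sequence are hypothetical and are not taken from rpsF mRNA or 16S rRNA. A sequence match alone says nothing about binding, since the text places the motif in a specific structural context.

```python
import re

# Lookahead pattern so that overlapping CCR occurrences are all reported.
CCR_PATTERN = re.compile(r"(?=(CC[AG]))")


def find_ccr(sequence):
    """Return (1-based position, triplet) for every CCR triplet (R = A/G).

    DNA input is accepted: T is converted to U before searching.
    """
    rna = sequence.upper().replace("T", "U")
    return [(m.start() + 1, m.group(1)) for m in CCR_PATTERN.finditer(rna)]


# Made-up sequence used only to exercise the function.
hits = find_ccr("GACCAUUCCGCCU")
print(hits)
```

In the made-up sequence, the triplets CCA and CCG are reported, while CCU is skipped because U is not a purine.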

Fig. 9.

The secondary structure of S6⋅S18 binding sites of RNA. (a) An assumed secondary structure of the rpsF mRNA region of E. coli. The SD sequence is highlighted in green. (b) A fragment of 16S rRNA of T. thermophilus. The nucleotides forming the S6⋅S18 binding site are outlined with frames; the conserved CCR motif is highlighted in red.

The RNA-protein contacts in the ribosomal and S6∙S18 regulatory complexes are conserved. Substitutions of amino acid residues in the S18 protein that lead to a loss of protein affinity for rRNA also weaken its interaction with mRNA [65, 67]. Mutations that disrupt protein–protein contacts in the S6∙S18 complex also lead to a decrease in the level of regulation of S6 operon protein synthesis [63, 67]. Analysis of the 5'-UTR of the S6 operon in γ-proteobacteria, Firmicutes, and Tenericutes showed that the CCR motif is part of the loop of the conserved P1 mRNA hairpin [65], with SD at the 3' end (Fig. 9a). In the model of the mRNA complex, the S6 protein, which interacts with the minor groove of h22 and h23b of rRNA in the ribosome, forms bonds with the P1 helix. The S18 protein can also contact the P1 helix and the CCR motif (Fig. 9a).

Binding of S6∙S18 to mRNA stabilizes its structure, making SD unavailable for interaction with the ribosome and inhibiting the translation of the S6 operon mRNA by the feedback principle. Mutational analysis showed that the P1 helix sequence of rpsF mRNA is important for translation efficiency in both E. coli and B. subtilis cells [66, 67], but the S18 protein of B. subtilis has a weak affinity for mRNA in the absence of the S6 protein [66].

L10 Operon

The L10 operon (the rplJL operon) of E. coli contains the genes of the L10 (rplJ) and L12 (rplL) r-proteins as well as of the β and β' subunits of RNA polymerase (Fig. 1g). As a result of attenuation of transcription and processing of rplJL mRNA, two separate transcripts are formed, one of which contains the cistrons of the r-proteins, and the other contains the cistrons of the RNA polymerase subunits. Translation of the rplL and rplJ cistrons is coupled [23] and regulated by competitive inhibition by a complex consisting of one L10 protein molecule and four L12 molecules, L10∙(L12)4 [68–72].

In representatives of seven genera of enterobacteria, including E. coli, the rplKAJL genes are divided into two operons: L11 (rplKA), which is regulated by the L1 protein, and L10 (rplJL), which is regulated by the L10 protein [23, 24]. In archaea, the rplAJL genes are transcribed as a tricistronic mRNA, whose translation is regulated by the L1 protein [73].

The L12 protein on the ribosome interacts only with the L10 protein, does not bind to rRNA, and cannot independently regulate translation [23, 74]. This is the only r-protein that is present on the ribosome in multiple copies. In E. coli and other mesophilic bacteria, a pentameric complex L10∙(L12)4 is formed, while thermophilic bacteria contain a heptameric complex L10∙(L12)6 [75–77].

The main contribution to the recognition of the L10 protein on the ribosome comes from the “kink–turn” consensus motif (H42–44 of 23S rRNA) in the region of the GTPase center (Fig. 10b) [57, 78], which is similar to the L10-binding site of rplJ mRNA of E. coli (Fig. 10a) [79]. The secondary structure of the mRNA plays an important role in the interaction with the protein; removal of any part of this helix reduces the efficiency of regulation [80, 81]. It is known that unpaired nucleotides and bulges play a leading role in the binding of many r-proteins to RNA [82–86].

Fig. 10.

The secondary structure of L10⋅(L12)4 binding sites of E. coli RNA. (a) The secondary structure of the L10⋅(L12)4 binding site of rplJL mRNA. (b) The secondary structure of the L10⋅(L12)4 binding site of 23S rRNA. The L10⋅(L12)4 binding sites are shown in red. The rplJ mRNA start codon is highlighted in green.

It was shown that two conserved adenines in the UUAA mRNA bulge are protected from chemical reagents by the L10∙(L12)4 complex (Fig. 10a) [87]. In all organisms, the bulge is located at a distance of 4 bp from the “kink–turn” consensus motif. The conserved UAA loop of 23S rRNA has the same location relative to the “kink–turn” motif as the conserved adenines in the mRNA bulge (Fig. 10) [79]. Substitutions of these adenines in both mRNA and rRNA reduce the affinity of L10∙(L12)4 for RNA.

The L10 operon regulation model is based on two alternative conformations of the 5'-UTR of rplJ mRNA. It was assumed that the operon is regulated as a result of competition between the ribosome and the repressor for binding to mRNA in an “open” or “closed” conformation (Fig. 11a). Binding of L10∙(L12)4 leads to a change in the secondary structure of the mRNA with the formation of a double helix between the SD of the rplL cistron and the 5'-proximal site of the mRNA; eventually, a “closed” conformation of the mRNA is formed (SD unavailable) [88] (Fig. 11a). This structure can become a target for RNases specific to double-stranded RNA (RNase III), which leads to a decrease in mRNA stability.

Fig. 11.

Schemes for models of the regulation of the rplJL mRNA expression. (a) The model of coordinated expression of rplJL mRNA translation in E. coli. SD sequences are indicated in green. (b) The autogenous transcription attenuation model of the L10 operon of B. subtilis. The gray square indicates the termination site. The nucleotide residues between the AT and AAT hairpins are highlighted in yellow, and the nucleotide residues between the AT and T hairpins are highlighted in blue. Green indicates the SD of the gene encoding the leader peptide and rplJ. The conserved “kink-turn” motif that the L10⋅(L12)4 complex interacts with is highlighted in red.

Synthesis of rplJL mRNA of B. subtilis is regulated by transcription attenuation [89, 90]. The leader region of this mRNA contains three overlapping hairpins: the internal transcription terminator (T), the AT hairpin upstream of it, which acts as an antiterminator of transcription, and the AAT structure located further upstream, which acts as an anti-antiterminator (Fig. 11b). L10∙(L12)4 of B. subtilis stabilizes the AAT hairpin of the mRNA (Fig. 11b), preventing the formation of the AT hairpin and ensuring transcription termination. The leader region of rplJL mRNA of B. subtilis encodes a leader peptide whose SD is located at the 3' end of the terminator hairpin (Fig. 11b). Translation of the leader peptide can increase the expression of rplJL of B. subtilis by blocking access of the ρ-factor to the transcript. Thus, the L10 operon of B. subtilis is regulated by both the attenuation mechanism and the antitermination mechanism mediated by the leader peptide. This dual post-transcriptional control can provide fine-tuning of rplJL expression in B. subtilis depending on the cell growth phase. A similar structure of the rplJL leader region, which includes the leader peptide sequence, was found in other Bacillus species [89].
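The hairpin part of this attenuation scheme can be summarized as simple mutual-exclusion logic. The sketch below is a deliberately simplified illustration: it treats each hairpin as either formed or not formed, assumes that overlapping hairpins cannot coexist, and leaves out the leader peptide and the ρ-factor altogether. The function name is hypothetical.

```python
def leader_hairpins(complex_bound):
    """Simplified on/off model of the overlapping rplJL leader hairpins.

    complex_bound: whether L10.(L12)4 is bound to the leader (True or False).
    The leader peptide and the rho factor are not modeled here.
    """
    aat = bool(complex_bound)  # the bound complex stabilizes the AAT hairpin
    at = not aat               # AT overlaps AAT, so only one of them forms
    t = not at                 # the terminator overlaps AT and needs AT absent
    return {
        "AAT": aat,
        "AT": at,
        "T": t,
        "outcome": "termination" if t else "readthrough",
    }


for bound in (True, False):
    print(bound, leader_hairpins(bound)["outcome"])
```

In this simplified picture, binding of the complex gives termination, and its absence gives readthrough of the leader.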

REGULATION OF RIBOSOMAL PROTEIN SYNTHESIS BY THE “ENTRAPMENT” MECHANISM

The mechanism of regulation of r-protein synthesis by the “entrapment” principle involves the formation of an “inactive” mRNA structure upon binding a repressor protein, which blocks the ribosome at the preinitiation stage. Autoregulation by the “entrapment” mechanism of synthesis of the S15 r-protein in E. coli is the best studied example. It is assumed that the “entrapment” mechanism operates at a lower affinity of the protein to mRNA than in the competitive mechanism, since the repressor only needs to stabilize the unproductive initiation complex, whereas in the competitive mechanism, the repressor and the ribosome must compete for binding to mRNA [91].

S15 Operon

The S15 operon consists of two genes, rpsO (encoding the S15 r-protein) and pnp (encoding polynucleotide phosphorylase). These two genes are cotranscribed in E. coli. S15 inhibits the translation of its own mRNA when it is synthesized in excess. The operon contains two promoters: P1, which is located upstream of the rpsO gene, and a weak promoter P2, which is located between the rpsO and pnp genes (Fig. 1h). In addition, there is a sequence in the region between the genes that forms a ρ-independent terminator, and there is a hairpin in the region downstream of P2 that contains the RNase III recognition region (Fig. 1h) [92]. When this hairpin is destroyed, a duplex structure is formed with an unordered 3'-terminal part, which is cleaved by polynucleotide phosphorylase itself, leading to destabilization of pnp mRNA. Thus, polynucleotide phosphorylase regulates its own synthesis at the post-transcriptional level via an RNase III-dependent pathway [93].

The single-domain S15 protein interacts with 16S rRNA at two sites (Fig. 12a) and plays a major role in the assembly of the central domain of the 30S ribosomal subunit [63]. The main protein binding site is formed by helices 20–22 of 16S rRNA and a GGC site at their junction. The other site is a conserved G∙U/G-C motif located at a distance of one turn of the helix from the three-helix junction.

Fig. 12.

The diagrams of S15 binding sites on RNA. (a) An rRNA fragment of E. coli; site 1 (GGC) is marked in red; site 2 (G⋅U/G-C motif) is marked in blue. (b–e) The schemes of the S15 binding sites of mRNA of E. coli (b), R. radiobacter (c), T. thermophilus (d), and G. kaustophilus (e). The start codon and SD sequence are marked. The G⋅U/G-C motif is outlined in blue; the three-helix junction is shown in red; additional binding sites of the S15 protein on mRNA are shown in green.

The 5'-UTR of rpsO mRNA, in contrast to the S15-binding site on rRNA, has a non-conserved structure. Differences between the protein homologs of E. coli (mRNAEco, EcoS15), Geobacillus kaustophilus (mRNAGka, GkaS15), T. thermophilus (mRNATth, TthS15), and Rhizobium radiobacter (mRNARra) determine the specificity of recognition of various mRNA structures in these organisms (Figs. 12b–12e). Thus, even conserved RNA-binding proteins of different bacteria may have a different RNA-recognizing module, which indicates the co-evolution of bacterial homologs of the S15 protein and specific sites on the mRNA. Not all rpsO mRNAs are regulated by S15 protein homologs. For example, the translation of mRNAEco does not change in response to the addition of GkaS15, and mutations in mRNAGka and mRNARra do not affect the interaction with the S15 protein of other organisms [94].

The S15-binding site of E. coli mRNA, like that of rRNA, has two spatially separated binding sites. The main mRNAEco site is similar to the G∙U/G-C motif of 16S rRNA; the other site only slightly resembles the junction of three helices. In T. thermophilus, on the contrary, the main binding site is similar to the junction of three 16S rRNA helices, and the G∙U/G-C motif has a replacement, G∙G/G-C (Fig. 12d). The binding sites of mRNAGka for the S15 protein are similar to both binding sites on 16S rRNA (Fig. 12e). The fourth variety, mRNARra, has a conserved G∙U/G-C site and a structure resembling a triple knot (Fig. 12c) [94]. Introduction of mutations into the G∙U/G-C motif in E. coli and G. kaustophilus and into a site resembling the triple knot structure in T. thermophilus and R. radiobacter leads to inactivation of the regulatory function of the S15 protein [95]. The EcoS15 amino acid residues that recognize the minor groove of h22 in 16S rRNA are also involved in the recognition of the corresponding helix of the mRNA pseudoknot [96].

In the absence of the S15 protein, rpsO mRNA can adopt either a pseudoknot or a double hairpin conformation, but only the pseudoknot form binds to the ribosome (Fig. 13a). Translation of E. coli rpsO mRNA is inhibited when the ternary complex (S15∙30S∙mRNA) transits to an inactive conformation [97]. The mRNA pseudoknot site interacts with the N-terminal domain of the S2 protein on the ribosome, and the site in front of the pseudoknot contacts h26 of 16S rRNA. At the same time, SD remains available for interaction with the ribosome, since it is located inside a large loop of the pseudoknot. S15 stabilizes this state of mRNA by binding to the ribosome, blocks the ribosome in the preinitiation state, and prevents the transition of the start codon to the decoding center (Fig. 13a), which inhibits the codon-anticodon interaction in the P-site [98]. This is a classic example of regulation based on the “entrapment” principle.

Fig. 13.
figure 13

The illustration of the mechanism of regulation of the S15 protein synthesis. (a) The “entrapment” mechanism that inhibits the translation of rpsO mRNA in E. coli. (b) The model of the competitive mechanism of regulation of translation of the S15 r-protein of T. thermophilus. rpsO mRNA either binds to TthS15 and is destroyed, or binds to the 30S subunit, forming an active initiation complex.

The S15 mRNA regulatory site in T. thermophilus is formed by three stem-loops (Fig. 12d) that do not have structural similarity to the pseudoknot of E. coli mRNA (Fig. 12b). The mRNA structure undergoes conformational changes upon interaction with the TthS15 protein, which results in the formation of an mRNA structure similar to the protein binding site on 16S rRNA. As a result, the site of binding to the 30S ribosomal subunit on mRNA becomes inaccessible (Fig. 13b) [95], and translation is inhibited [99]. The affinity of EcoS15 for its own mRNA is two orders of magnitude lower than that of TthS15 for its mRNA. It is believed that the regulation of rpsO mRNA translation in T. thermophilus follows a competitive mechanism, rather than the “entrapment” mechanism, as in E. coli.
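Why repressor affinity matters for a competitive mechanism can be illustrated with a toy equilibrium model. The sketch below is not taken from the cited studies: it assumes a single mRNA site for which the repressor and the 30S subunit compete in a mutually exclusive way, with both ligands in excess over mRNA. The function name, the concentrations, and the dissociation constants are hypothetical and serve only to show the effect of a 100-fold difference in repressor affinity.

```python
def fraction_in_initiation_complex(rib, kd_rib, rep, kd_rep):
    """Fraction of mRNA bound by the 30S subunit at equilibrium.

    Toy model (assumption, not from the source): repressor and 30S
    subunit bind the mRNA mutually exclusively, and free ligand
    concentrations are approximated by their total concentrations.
    All arguments share the same (arbitrary) concentration unit.
    """
    if kd_rib <= 0 or kd_rep <= 0:
        raise ValueError("dissociation constants must be positive")
    a = rib / kd_rib   # relative weight of the mRNA-30S state
    b = rep / kd_rep   # relative weight of the mRNA-repressor state
    return a / (1.0 + a + b)


# Hypothetical numbers: identical concentrations, repressor Kd varied 100-fold.
tight = fraction_in_initiation_complex(rib=1.0, kd_rib=1.0, rep=1.0, kd_rep=0.01)
weak = fraction_in_initiation_complex(rib=1.0, kd_rib=1.0, rep=1.0, kd_rep=1.0)
print(f"tight repressor: {tight:.4f}, weak repressor: {weak:.4f}")
```

In this simplified picture a repressor that binds 100-fold more weakly leaves a much larger share of the mRNA free to form initiation complexes, which is consistent with the idea that competition requires high repressor affinity, whereas “entrapment” does not.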

S4/α Operon

The α operon of E. coli includes five r-protein genes and the RNA polymerase α-subunit gene (Fig. 1i). Synthesis of the operon proteins, except for the α-subunit of RNA polymerase, is regulated by the S4 protein [100, 101]. S4 together with the S7 protein initiates the assembly of the 30S ribosomal subunit [102, 103]. It interacts with the 5' domain of 16S rRNA at the intersection of five helices (h3, h4, h16–h18) [104–106]. It is known that the formation of functional complexes of the S4 protein with both mRNA and rRNA requires only its N-terminal part [107].

The structures of the binding sites of the E. coli S4 protein on mRNA and rRNA differ. The regulatory region on the α-operon mRNA is a double pseudoknot (Fig. 14a) [108] that includes the 5'-UTR and the beginning of the coding region of rpsM mRNA [109]. The H1 mRNA helix has almost no conserved nucleotides, and its folding is most important for regulation (Fig. 14a). The rpsM mRNA start codon (GUG) is 4 times less effective than AUG; however, the replacement of GUG with AUG reduces the expression of α-operon genes in vivo sixfold [110]. Apparently, the initiation of translation of α operon mRNA, as well as its repression, depends on the structure of this part of the mRNA.

The “entrapment” mechanism of regulation of the α operon, as in the case of the S15 operon, is based on the conformational switch between two structures of the mRNA pseudoknot and the formation of an “inactive” preinitiation complex (Fig. 14b) [91, 109, 111]. A productive preinitiation complex is formed only with the “active” conformation of the mRNA pseudoknot. The “inactive” mRNA conformation forms a complex with the 30S subunit, but this complex cannot bind tRNAfMet. The S4 protein plays the role of an allosteric repressor that shifts the balance between the two mRNA conformations towards the “inactive” form [109, 112]. When translation is regulated by the “entrapment” mechanism, there is no need for a high affinity of the repressor for mRNA; the affinities of the E. coli S4 protein for mRNA and rRNA are approximately the same [113, 114].

Fig. 14.
figure 14

The regulation of the α operon. (a) Secondary structure of the S4 binding pseudoknot of rpsM mRNA of E. coli. The SD and start codon sequences are highlighted in green. Conserved nucleotides are highlighted in red. (b) Diagram of the translation repression of the E. coli α operon by the S4 protein.

In many eubacteria, the S4 protein gene (rpsD) is not part of the α operon. Thus, the rpsD gene of B. subtilis is an autoregulated transcription unit [115]. The secondary structure of the S4-binding site of B. subtilis mRNA does not have a pseudoknot conformation [11], which indicates a difference in the principles of recognition of the regulatory site by the S4 protein in B. subtilis and E. coli.

It should be noted that the S4 protein performs the same function on ρ-dependent terminators as the NusA transcription factor in antitermination of transcription [116]. Thus, this protein not only inhibits the translation of the α operon, but also stimulates rRNA transcription, maintaining a balanced synthesis of rRNA and r-proteins.

REGULATION OF RIBOSOMAL PROTEIN SYNTHESIS BY RETROREGULATION

Retroregulation is degradation of mRNA by ribonucleases as a result of interaction of the repressor protein with the distal mRNA site. The inhibition of the synthesis of the L14–L24 S8/spc operon proteins is a well-studied example of this type of regulation.

S8/spc Operon

The S8/spc operon of E. coli consists of 12 genes, the expression of 10 of which is inhibited when the S8 r-protein binds to the intercistronic region of mRNA that includes the start codon of the rplE gene (Fig. 1j).

The conserved two-domain S8 protein plays an important role in the assembly of the 30S ribosomal subunit; it binds mainly to helix 21 of 16S rRNA [48]. The binding sites of the S8 protein on E. coli mRNA and 16S rRNA are very similar [117, 118] (Figs. 15a, 15b); however, the affinity for mRNA is 5 times lower than for a specific fragment of 16S rRNA [117].
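To put such fold differences in affinity on an energy scale, the standard relation ΔΔG = RT ln(ratio of dissociation constants) can be used. The short sketch below is only an illustration: the function name is hypothetical, and the temperature of 37°C (310.15 K) is an assumption, not a condition reported in the cited binding studies.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold(fold, temp_k=310.15):
    """Difference in binding free energy (kcal/mol) for a given Kd ratio.

    fold   -- ratio of dissociation constants (weaker / tighter), must be > 0
    temp_k -- absolute temperature; 310.15 K (37 C) is an assumed default
    """
    if fold <= 0:
        raise ValueError("fold must be positive")
    return R_KCAL * temp_k * math.log(fold)


# 5-fold: S8 affinity for mRNA versus the 16S rRNA fragment (from the text).
# 100-fold: EcoS15 versus TthS15 affinity for their own mRNAs (from the text).
for fold in (5, 100):
    print(f"{fold}-fold difference: {ddg_from_fold(fold):.2f} kcal/mol")
```

Under this assumption a 5-fold difference corresponds to roughly 1 kcal/mol, which is a small energetic penalty compared with the nearly 3 kcal/mol implied by a two-orders-of-magnitude difference.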

Fig. 15.
figure 15

The regulation of the S8/spc operon. (a, b) Secondary structures of the S8 protein binding sites on mRNA (a) and rRNA (b) in E. coli. The frame shows the region with which the S8 protein interacts. SD and the start codon of the L5 protein mRNA are highlighted in green. Conserved mRNA nucleotides are shown in red. (c) The model of “retroregulation” of L14 and L24 protein synthesis by the S8 protein (a part of the S8/spc operon is shown).

The regulatory site is located in the 5'-terminal part of the L5 protein mRNA (Fig. 15c) [119, 120]. In the regulatory complex, the protein interacts mainly with the inner loop of the mRNA helix [117]. Unpaired A8 and A9 and the G12–C79 mRNA pair (Fig. 15a) are conserved [11]. Contacts of the S8 protein with the inner loop of a specific mRNA fragment are similar to contacts with 16S rRNA [48, 117].

The S8 protein directly blocks L5 protein translation by binding to mRNA at the beginning of its gene. Interruption of translational coupling leads to inhibition of translation of subsequent protein cistrons (Fig. 15c) [121]. However, only three operon genes lack SD, and some of them are located at the 3' end of the S8 binding site (Fig. 15c). A mechanism for regulating the expression of the spc operon genes that involves endonucleases and subsequent mRNA degradation by exonucleases has been proposed. According to this model, after the action of an endonuclease (for example, RNase III), L14–L24 mRNA is degraded by 3'→5' exonucleases (polynucleotide phosphorylase and/or RNase II) (Fig. 15c). Thus, the inhibition of translation of the L14 and L24 cistrons occurs by retroregulation, which leads to mRNA destabilization [122]. There is no information about the regulation of translation of the most remote secY and rpmJ cistrons (Fig. 1j).

S7/str Operon

The S7/str operon of E. coli consists of four genes encoding the S12 and S7 r-proteins and the elongation factors EF-G and EF-Tu (Fig. 1k). The S7 protein inhibits the synthesis of the S12 and S7 r-proteins [123] and of EF-G; the translation of the S7 and S12 cistrons is coupled [124]. The S7 protein of E. coli initiates the assembly of the 30S ribosomal subunit: it initiates the folding of the 3' major domain of 16S rRNA and promotes the binding of other r-proteins that form the head of the 30S subunit [103]. S7 interacts with a short fragment of 16S rRNA comprising two multibranch loops in the lower part of the 3' major domain (Fig. 16b) [125].

Fig. 16.
figure 16

The secondary structures of S7 binding fragments of mRNA (a) and rRNA (b). Identical sequences required for binding are shown in frames. The stop codon of the S12 protein (UAA) on the mRNA fragment is highlighted in gray, the SD sequence and the S7 initiation codon are highlighted in green.

The S7 binding site on the mRNA of the E. coli S7/str operon is located between the rpsL and rpsG genes [123] and contains an irregular hairpin (Fig. 16a). The minimal mRNA fragment that retains an affinity for S7 includes the intercistronic region of str mRNA. The secondary structures of the protein binding sites on mRNA and rRNA differ, but both RNAs contain two identical S7-binding sites (Fig. 16). The first site is h42 of 16S rRNA and the three-helix junction of str mRNA [124]; the second site is the B and A rRNA loops and the lower part of helix III of mRNA [126].

Upon interaction with the S7 protein, helix V of mRNA is destabilized and translation of the S7 and EF-G cistrons is repressed; translation of the S12 cistron can be inhibited by retroregulation [124, 127].

The structure of the S7/str operon mRNA region is not conserved. The S7/str operon of T. thermophilus does not contain an extended region between the rpsL and rpsG genes, and in Cyanobacteria this sequence is more similar to a specific region of 16S rRNA [11].

Autoregulation of the S20 Operon

The mechanism of autoregulation of the S20 operon has not yet been determined, and there are no data on the specific binding site of the S20 protein on mRNA. It has been shown that ppGpp-dependent regulation of S20 protein synthesis occurs when amino acids are deficient, and this regulation requires the leader sequence of mRNA [128].

The S20 protein is one of six primary 16S rRNA-binding proteins of the 30S ribosome subunit. S20 interacts with two sites of 16S rRNA, namely the helices 9, 11, 13 and helix 44, thereby connecting the 5' domain and the 3' minor domain of 16S rRNA [129]. Removal of the S20 protein reduces the speed and efficiency of mRNA binding to the ribosome, and disrupts the assembly of the 30S subunit [130].

Synthesis of the S20 r-protein is autoregulated at the post-transcriptional level (Fig. 1l) [131]. It was assumed that S20 binds to a site in the 5'-UTR of its own mRNA and thereby blocks SD and the start codon. It is worth noting that the start codon of the rpsT gene encoding the S20 r-protein is UUG [132]. The UUG start codon of rpsT mRNA and its flanking nucleotides constitute the minimum site required for effective regulation of S20 synthesis. Replacing UUG with AUG reduces the inhibitory effect of the S20 protein [133].

Despite information about the ability of the S20 protein to inhibit its own synthesis both in vivo and in vitro, data on its interaction with rpsT mRNA fragments could not be obtained [133, 134]. It was assumed that the regulation of S20 protein synthesis requires its interaction not only with a part of its mRNA, but also with the preinitiation complex of mRNA and 30S subunit [133].

CONCLUSIONS

Synthesis of the r-proteins is regulated by two rather similar mechanisms: competition between protein binding sites on rRNA and mRNA, and the “entrapment” mechanism. When translation is competitively inhibited, the repressor binding site on mRNA may be similar to the binding site of this protein on rRNA (L1 protein), or may have virtually no homology to it (S2 protein). Regulation of protein synthesis by the “entrapment” mechanism (S15 operon and S4/α operon) is carried out by blocking the ribosome in the preinitiation state upon binding of the regulatory protein. Moreover, the regulatory binding sites of both the S4 and S15 proteins on E. coli mRNA have a pseudoknot structure. The regulatory region of the L20 operon mRNA of E. coli also has a pseudoknot structure; however, the regulation of protein synthesis of this operon does not follow the “entrapment” mechanism but is implemented via competition between the repressor and the ribosome for binding to mRNA. Although the “entrapment” mechanism is more effective than the competition mechanism, since it can function even if the repressor concentration is low or its affinity for mRNA is weak, the translation of most bacterial mRNAs is regulated by simple competition between the repressor and the 30S subunit for interaction with mRNA. The retroregulation mechanism, which involves mRNA destabilization upon interaction with the repressor protein (regulation of the synthesis of the L14 and L24 proteins by the S8 protein and, probably, of the synthesis of the S12 protein by the S7 protein), may also be used in the expression of r-protein genes.

Despite the conserved properties of ribosomal regulatory proteins, the sites of their binding to mRNA and the mechanisms of protein synthesis regulation may differ from one organism to another. As an example, the S15 protein of T. thermophilus regulates the translation of the S15 operon by a competitive mechanism, and not by the “entrapment” mechanism, as in E. coli. In B. subtilis and some other bacteria, the binding sites of the L20, S15, S7, and S4 r-proteins on mRNA differ from the regulatory sites of these proteins in E. coli. Expression of the B. subtilis L10 operon genes can be regulated not only by competition between the ribosome and the protein complex for mRNA binding, as in E. coli, but also by antitermination.