INTRODUCTION

The nucleolus is the most prominent domain of the cell nucleus. It is not separated from the nucleoplasm by a membrane and forms near the chromosomal regions that carry repeats of ribosomal genes (rDNA). The canonical function of the nucleolus is the biosynthesis of ribosomes. The functions currently attributed to the nucleolus also include participation in the regulation of the cell cycle, apoptosis, the development of viral infections, and cellular aging (Núñez Villacís et al., 2018). New data also link the nucleolus with the control of genome stability and the development of human malignant neoplasms (Lindström et al., 2018). Thus, the nucleolus is a multifunctional and significant cellular compartment, the noncanonical functions of which are actively studied at present. According to mass spectrometric data, the nucleolus of human cells contains about 6000 proteins, most of which are factors involved in the transcription of ribosomal genes, the processing of newly formed pre-rRNA transcripts, and the assembly of ribosomal particles. About a third of the proteins identified within the nucleoli are poorly understood, and their role in cellular metabolism is still unclear (Tafforeau et al., 2013). These proteins include, in particular, the nucleolar protein SURF6. SURF6 was first described in 1996 as an expression product of the Surf6 gene, a member of the Surfeit locus in the murine genome (Magoulas and Fried, 1996). The data available to date suggest that SURF6 is involved in both ribosome biogenesis and cell-cycle regulation. However, the currently available information does not allow more definite conclusions on the role of SURF6 in the metabolism of cells of higher eukaryotes, and this protein is still one of the least studied proteins of the mammalian nucleolus.

Surfeit LOCUS

The Surfeit locus, a conserved gene locus, is widespread in vertebrates; it has been found in the genomes of humans, mice, chickens (Duhig et al., 1998), the clawed frog (Wolff et al., 2002), and puffer fish (Armes et al., 1997; Wolff et al., 2002) (Fig. 1). This suggests that the organization of the Surfeit locus plays a significant role in gene expression in vertebrates. In the puffer fish Fugu rubripes, genes of the Surfeit locus are located in three different regions of the genome, but their structure is homologous to the structure of genes of the Surfeit locus in mammals (Armes et al., 1997). Invertebrates, such as the fruit fly D. melanogaster and the nematode C. elegans, do not have the Surfeit locus that is characteristic of vertebrates. They do have genes homologous to those of the Surfeit locus, but these are located on different chromosomes; the expression products of these genes also have homologs in vertebrate species (Armes et al., 1997).

Fig. 1.

Surfeit loci of mouse (M. musculus) and human (H. sapiens). Genomic DNA is indicated by the thick line, and CpG-rich regions are shown as black rectangles. The direction of transcription of Surfeit locus genes and pseudogenes ψ is indicated by arrows; the distance between genes is also indicated, where bp is a base pair, kb is a thousand base pairs (Magoulas and Fried, 2000).

The Surfeit locus of mammals is a tight cluster of six unrelated housekeeping genes (Surf-1, Surf-2, … Surf-6) (Huxley and Fried, 1990). It was shown that the nucleotide sequences of the six genes of the Surfeit locus and the amino acid sequences of their expression products have no homology with each other or with other genes and proteins. The Surfeit locus has unique properties: overlapping gene sequences, bidirectional transcriptional promoters, and a very tight arrangement of genes (Armes et al., 1997).

The direction of transcription for five of the six genes of the Surfeit locus (except for the Surf-6 gene) is opposite the transcription direction of the neighboring gene; moreover, the 5'-ends of each of the six genes are associated with regions of genomic DNA enriched in cytosine and guanine (CpG enriched regions) and do not contain the consensus TATA sequence (Fig. 1), which is typical of housekeeping genes (Armes et al., 1997). In particular, the methylation of such regions leads to suppression of the activity of the promoter located in this region (Antequera et al., 1989).

Compared with other genes, the Surfeit locus genes have very small intergenic distances: they are separated by no more than 73 base pairs, while most closely located mammalian genes are separated by tens or hundreds of thousands of base pairs. The unusually tight arrangement typical of Surfeit locus genes is conserved in mammals and birds. This fact suggests that such an organization may have biological significance and play an important functional or regulatory role. Such a tight arrangement of genes within the locus may also indicate cis-interaction of the genes or mutual regulation of the expression of the genes of this locus in vertebrates (Armes et al., 1997).

It should be noted that the promoter region of the human Surf-1 and Surf-2 genes contains four binding sites for transcription factors. The factors that bind two of these sites were identified as the Sp1 and YY1 proteins (Cole and Gaston, 1997). It is known that the Sp1 and YY1 transcription factors interact with each other and with the c-Myc transcription factor (Seto et al., 1993). The c-Myc transcription factor is a nuclear protein that regulates the expression of genes involved in cell proliferation, differentiation, and apoptosis (Marcu et al., 1992; Stine et al., 2015). It was shown that the Surf-1 promoter is activated via the binding of the YY1 and c-Myc transcription factors in response to the addition of growth factors. In addition, it was found that expression of the Surf-6 gene in D. melanogaster is also activated by the c-Myc transcription factor (Orian et al., 2003). With the use of rat fibroblasts constitutively expressing c-Myc, 38 genes encoding proteins with nucleolar localization, the expression of which depended on the level of the endogenous c-Myc factor, were identified (Schlosser et al., 2003). These genes include the human Surf-6 gene, as well as genes of other nucleolar proteins: nucleolin, B23/nucleophosmin, fibrillarin, and Nopp140. All of these proteins play key roles in ribosome biogenesis. It can be assumed that, despite the absence of the Surfeit locus in invertebrates and, in particular, in D. melanogaster, Surf genes may have similar mechanisms of Myc-dependent activation.

The expression products of each of the six Surf genes have been identified. Proteins encoded by Surf genes have different functions and have interspecies homologs that belong to the SURF1, SURF2, … SURF6 protein families presented in the PFam protein family database (http://www.sanger.ac.uk/Software/Pfam/).

Thus, proteins belonging to the SURF1 family are transmembrane mitochondrial proteins that are presumably involved in the biogenesis of cytochrome c oxidase. The protein encoded by Surf-1 is a component of the mitochondrial translation regulation assembly intermediate of cytochrome c oxidase (MITRAC) complex, which is involved in the regulation of cytochrome c oxidase assembly. Surf-1 defects are the cause of Leigh syndrome, a severe neurological disorder usually associated with systemic deficiency of cytochrome c oxidase (complex IV), and Charcot–Marie–Tooth disease, a sensorimotor polyneuropathy (Smith et al., 2005).

SURF2 is a conserved protein localized in the nucleus and nucleolus, the function of which is unknown.

The Surf-3 gene encodes the L7a ribosomal protein.

Surf-4 encodes a conserved integral membrane protein that contains several putative transmembrane regions and is associated with the endoplasmic reticulum. The specific function of this protein has not been determined, but its yeast homolog is directly required for the packaging of glycosylated pro-alpha factor into COPII vesicles. This gene uses several polyadenylation sites, which leads to a change in the transcript length (Yin et al., 2018).

Surf-5 encodes a protein component (MED22) of the mediator complex, which functions in the transcription regulation due to the bridging of interactions between gene-specific regulatory factors, RNA polymerase II, and general transcription factors. Alternatively spliced transcript variants encoding different isoforms were observed (Sato et al., 2003).

Finally, the Surf-6 gene encodes the SURF6 protein, which is a protein of the nucleolar matrix (Magoulas et al., 1998). Surf-6 is expressed in all tissue types (Trott et al., 2001), but its function is still not precisely known.

Thus, the Surfeit locus is unique both in the structure and mutual arrangement of genes and in the variety of functions of proteins encoded by Surf genes.

It should be noted that a pseudogene located at a distance of 68 bp from the 3'-end of the Surf-5 gene was found in the murine Surfeit locus. The pseudogenic sequence is also found in the human locus and corresponds to the rpL21 ribosomal protein (Trott et al., 2001). The presence of pseudogenes scattered throughout the genome is typical of ribosomal proteins; in particular, rpL21 has 145 pseudogenes (Kovalenko and Patrushev, 2018).

It is also worth noting that the human genome contains one Surf-6 pseudogene, while 53 pseudogenes were identified for Surf-3 (L7a ribosomal protein). Pseudogenes for other members of the Surfeit locus were not identified in humans (http://pseudofam.pseudogene.org/).

Analysis of the human Surf-6 cDNA sequence showed that the open reading frame begins with the first ATG, which is surrounded by a sequence similar to the Kozak consensus sequence, which serves for the efficient initiation of translation in mammals. Several binding sites for transcription factors are located upstream of the Surf-6 transcription start site; however, as in the case of other genes of the Surfeit locus, there is no consensus TATA sequence (Magoulas and Fried, 2000).
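
The notion of an ATG in a favorable initiation context can be illustrated with a minimal sketch. It assumes the commonly used simplification of the Kozak rule, in which the context is judged by a purine at position -3 and a G at position +4 relative to the A of the ATG (+1). The function names and the short test sequences are hypothetical and are not taken from the Surf-6 cDNA.

```python
def first_atg(cdna: str) -> int:
    """Return the index of the first ATG in the sequence, or -1 if absent."""
    return cdna.upper().replace("U", "T").find("ATG")


def kozak_strength(cdna: str, atg_index: int) -> str:
    """Classify the context of an ATG by positions -3 and +4 (A of ATG is +1).

    Simplified rule (an assumption of this sketch): 'strong' if there is a
    purine at -3 and a G at +4, 'adequate' if only one holds, 'weak' otherwise.
    """
    seq = cdna.upper().replace("U", "T")
    if atg_index < 0 or seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    # Position -3 lies three nucleotides upstream of the A of the ATG.
    minus3 = seq[atg_index - 3] if atg_index >= 3 else ""
    # Position +4 is the nucleotide immediately following the ATG.
    plus4 = seq[atg_index + 3] if atg_index + 3 < len(seq) else ""
    purine_at_minus3 = minus3 in ("A", "G")
    g_at_plus4 = plus4 == "G"
    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "adequate"
    return "weak"


if __name__ == "__main__":
    toy = "GCCACCATGGCG"  # hypothetical sequence with a consensus-like context
    print(kozak_strength(toy, first_atg(toy)))
```

Such a check only scores the two most influential positions; a full comparison with the consensus would weigh the remaining flanking nucleotides as well.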

PROPERTIES OF THE SURF6 PROTEIN

The features of the SURF6 protein family include a relatively small size (355 and 361 a.a. in mice and humans, respectively); enrichment with lysine and arginine residues (28%), which give the proteins a large positive charge (the isoelectric point, pI, is about 10.5); the lack of any consensus functional sequences; and predominantly nucleolar localization. In humans, expression of the Surf-6 gene was described in tissues of all organs used for analysis: the pancreas, kidney, muscles, liver, lung, placenta, brain, and heart (Magoulas and Fried, 2000). Human SURF6 was shown to contain ten potential phosphorylation sites (results of SURF6 amino acid sequence analysis with the Prosite database (http://prosite.expasy.org)).
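
The enrichment with basic residues quoted above is a simple compositional measure. A minimal sketch of how such a value could be computed from a one-letter amino acid sequence is given below; the function name and the toy sequence are hypothetical, and the real SURF6 sequence is not reproduced here.

```python
def basic_residue_fraction(protein: str) -> float:
    """Return the fraction of lysine (K) and arginine (R) residues in a sequence."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    # Count only K and R; histidine is deliberately left out because the text
    # refers to lysine and arginine enrichment specifically.
    basic = seq.count("K") + seq.count("R")
    return basic / len(seq)


if __name__ == "__main__":
    toy = "MKRAAKKRLSAEERKKAL"  # hypothetical sequence, not SURF6
    print(f"K+R content: {100 * basic_residue_fraction(toy):.1f}%")
```

A value near 0.28 for a full-length sequence would correspond to the 28% reported for the family.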

It was determined with Western blots that the SURF6 protein has an electrophoretic mobility corresponding to about 46 kDa in humans and 43 kDa in mice. SURF6 is a highly hydrophilic protein (Magoulas and Fried, 2000). The amino acid sequence of the SURF6 protein has five potential sites for casein kinase II phosphorylation (Magoulas and Fried, 2000). Apparently, these sites are active, since the binding of SURF6 to casein kinase II in D. melanogaster was shown with the yeast two-hybrid system. It is believed that casein kinase II phosphorylation can affect nucleolar functions during the cell cycle and can control the activity of specific nucleolar proteins (Trott et al., 2001).

The nucleotide sequence of the Surf-6 gene is unique, since no genes homologous to the Surf-6 gene were found in an analysis of the nucleotide sequence databases. However, as shown, proteins of various taxonomic groups (from humans to yeasts) have a highly conserved domain, which allows them to be combined into one family, the SURF6 protein family (Polzikov et al., 2005) (Fig. 2).

Fig. 2.

Aligned amino acid sequences of the C-terminal conserved region of the SURF6 molecule (SURF6 domain) in representatives of various taxonomic groups. The average length of the SURF6 domain is 191 amino acid residues (a.a.); the percentage of a.a. identity within the domain in different species is 36%. The length of the SURF6 domain and the number of a.a. in the SURF6 molecule are indicated for each species. Sequences were aligned with the ClustalW method. Shading was performed for conservative amino acid residues (black color shows 100% a.a. identity in all species; gray color indicates less than 100% but more than 70% a.a. identity; light-gray color denotes less than 70% but more than 50% a.a. identity). The secondary structure is indicated in the figure as follows: rectangle, α-helical regions; arrow, β-folded layer; straight line, disordered regions (Polzikov et al., 2005).

The SURF6 domain is located in the C-terminal region and has an average length of 191 a.a. with a 36% interspecies identity of amino acid residues. The N‑terminal region of proteins of the SURF6 family has different lengths and does not show significant homology between proteins of the family in different species. Within the domain, there is a highly conserved core consisting of about 60 amino acids, nine of which are conserved in all species and are located between tryptophan and asparagine residues. The conserved domain of SURF6 has a predominantly α-helical structure. It was suggested that the amino acid residues that make up the core of the domain can play an important role both in the process of protein folding and in processes of the molecular interaction of SURF6 with protein partners and nucleic acids in vivo (Polzikov et al., 2005).
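
The identity and conservation values discussed for the SURF6 domain come from a multiple alignment. The sketch below shows one way to compute pairwise percent identity and a per-column conservation class from already aligned sequences. The thresholds follow the shading scheme described for Fig. 2 (100%, more than 70%, more than 50%); the function names, the gap handling, and the toy alignment are assumptions made for illustration and are not the procedure used by Polzikov et al. (2005).

```python
from collections import Counter


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def column_conservation(aligned: list[str]) -> list[float]:
    """Fraction of sequences sharing the most common residue in each column."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    fractions = []
    for column in zip(*aligned):
        residues = [r for r in column if r != "-"]
        if not residues:
            fractions.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        # Gaps count against conservation: divide by the number of sequences.
        fractions.append(top_count / len(column))
    return fractions


def shade(fraction: float) -> str:
    """Map a conservation fraction to the shading classes of the figure."""
    if fraction == 1.0:
        return "black"
    if fraction > 0.7:
        return "gray"
    if fraction > 0.5:
        return "light-gray"
    return "none"


if __name__ == "__main__":
    toy = ["WKRN", "WKQN", "WRQN", "WKQ-", "WAAN"]  # hypothetical alignment
    print([shade(f) for f in column_conservation(toy)])
    print(percent_identity(toy[0], toy[1]))
```

Note that percent identity depends on the chosen denominator (alignment length, shorter sequence, or gap-free columns), so values computed this way need not match published figures exactly.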

It was shown via affinity chromatography and with nitrocellulose filters carrying adsorbed DNA or RNA that murine SURF6 can bind to nucleic acids in vitro, and the binding to RNA was stronger than that to DNA (Magoulas et al., 1998). It was also shown that treatment of cells with RNase A completely prevents the immunocytochemical staining of nucleoli with antibodies to SURF6. Since the main type of RNA in nucleoli is rRNA at different stages of maturation, this observation is evidence in favor of the possible interaction of SURF6 with rRNA in situ. At the same time, the treatment of cells with DNase I did not affect the SURF6 localization (Gurchenkov et al., 2005), which may indicate the absence of an association between SURF6 and rDNA in situ. This is also indirectly indicated by the behavior of SURF6 in mitosis: all specific proteins of the nucleolus that are associated with rDNA, including RNA polymerase I and its transcription factors and the proteins UBF, TBP, and TAF, retain their association with the nucleolar organizer regions during mitosis. In contrast to these proteins, SURF6 was located on the surface of chromosomes, in the cytoplasm, and within the prenucleolar bodies, like fibrillarin and B23, the proteins involved in rRNA maturation (Gurchenkov et al., 2005).

INTRACELLULAR LOCALIZATION OF THE SURF6 PROTEIN

It was found via immunocytochemical electron microscopy that the SURF6 protein in mouse cells is localized mainly in the granular component (GC) (Magoulas et al., 1998) (Fig. 3), where rRNA is processed, and it is included in the nucleolar matrix. SURF6 is present in nucleoli during the cell cycle, and its localization remains almost the same in all phases of the cycle. In this case, the areas of the SURF6 localization significantly, but not completely, overlap with the localization areas of B23 and fibrillarin, and the degree of SURF6 and B23 colocalization is higher than the degree of SURF6 and fibrillarin colocalization (Magoulas et al., 1998).

Fig. 3.

Immunocytochemical detection of the SURF6 protein in mouse cells of the NIH/3T3 line with specific antibodies (Polzikov et al., 2012).

The SURF6 distribution during mitosis is similar to the distribution of proteins involved in rRNA processing. When cells enter mitosis and the nucleolus begins to disintegrate in prophase, the protein migrates from the nucleoli to the nucleoplasm, where it is located in the interchromosomal regions. In prometaphase and metaphase, SURF6 is localized on the chromosome surface and in the cytoplasm, and it remains partially associated with residual nucleoli. In anaphase, SURF6 is located mainly in the cytoplasm and weakly decorates chromosomes. In early telophase, the protein is located in the perichromosomal region. In telophase and late telophase, the protein is clearly found in the prenucleolar bodies and is practically absent from the cytoplasm. It was revealed that the SURF6 protein is not associated with mitotic nucleolar organizer regions (rDNA). The SURF6 protein was also found within cytoplasmic inclusions together with B23 and fibrillarin, although the dynamics of its movements differs from the behavior of these proteins. Thus, at the beginning of mitosis, SURF6 leaves the residual nucleolus later than B23 but earlier than fibrillarin, and, at the end of mitosis, SURF6 disappears from the cytoplasm later than fibrillarin but earlier than B23. In general, the behavior of SURF6 in mitosis suggests that it exhibits properties inherent in proteins involved in rRNA processing rather than transcription (Gurchenkov et al., 2005).

It was shown that the migration of proteins of the SURF6 family into nucleoli is enabled by evolutionarily conserved signals of nuclear localization and the targeted delivery of proteins to the nucleolus. The primary structure of these proteins contains multiple nuclear localization signals (NLSs). These signals can be either single or double. The largest number of NLSs is contained in conserved SURF6 domains of proteins (Polzikov et al., 2005).

To confirm experimentally the presence of NLS in the murine SURF6 sequence, a series of genetic constructs encoding the green fluorescent protein EGFP with different regions of the SURF6 protein sequence was created. The localization of chimeric proteins was studied in murine cells of the P19 line. At the same time, the overwhelming majority of truncated SURF6 forms had the ability to migrate into the nuclei and nucleoli of the cells. However, only polypeptides containing the N-terminal portion of the full-length SURF6 protein had a distinct nucleolar localization.

The conserved SURF6 domain is enriched with arginine and lysine residues. These are also a part of double overlapping NLSs, which constitute from 20 to 45% of the SURF6 domain sequence in different species. As shown for more than 79% of the analyzed proteins of the nucleus and nucleolus with known functions, NLSs are included in or directly adjacent to the protein domains responsible for the DNA or RNA binding properties of protein molecules (Polzikov et al., 2005).

INTERACTION OF SURF6 WITH OTHER PROTEINS

Several protein partners of human SURF6 were identified in HeLa tumor cells with two alternative methods, coimmunoprecipitation with specific antibodies to SURF6 and affinity chromatography. Both approaches showed that human SURF6 is associated with multifunctional proteins B23/nucleophosmin and nucleolin, the main cofactor of RNA polymerase I, UBF, and also with the EBP2 rRNA processing factor (Kordyukova et al., 2014a, 2014b).

Fourteen proteins that bind to the conserved SURF6 domain (Surf6_dom) were identified via mass spectrometry and affinity chromatography. These proteins form different functional groups. The main site of the localization and functioning of these proteins is the cytoplasm (for ribosomal proteins RPS9, RPL17, RPL26, translation factors EIF3D, eEF1γ, vimentin) or the nucleus (for other proteins). Five protein partners of Surf6_dom are involved in ribosome biogenesis: a structural protein of the small (40S) ribosome subunit RPS9; structural proteins of the large (60S) ribosome subunit RPL17, RPL26; and the proteins Ku70 and Ku80, which are involved in rDNA transcription. Surf6_dom also forms a complex with proteins involved in mRNA processing (SF3B3, hnRNP C, hnRNPH, hnRNPU), which corresponds to modern concepts of the participation of higher eukaryotic nucleoli in mRNA maturation (Kordyukova et al., 2014a). Thus, SURF6 interacts with factors of ribosome biogenesis involved in the regulation of rDNA transcription, rRNA processing, and the assembly of ribosomal subunits. It also interacts with proteins involved in the regulation of splicing and the cell cycle. In general, these data support the multifunctionality of the SURF6 protein.

It was predicted (Ferrolino et al., 2018) that human SURF6 contains multiple short, arginine-rich, multivalent linear motifs (called R motifs) in its primary structure, due to which SURF6 directly interacts and colocalizes with NPM1. It was shown that multivalent R motifs in the disordered N-terminal fragment of SURF6 (S6N; residues 1–182) interact with two acidic regions within the intrinsically disordered region (IDR) of NPM1. At concentrations above saturation, these interactions cause heterotypic liquid–liquid phase separation (LLPS) due to the electrostatic complementarity of the motifs (Ferrolino et al., 2018; Mitrea et al., 2018). Interestingly, two competing mechanisms are active in liquid-like NPM1–S6N droplets: heterotypic LLPS of NPM1 with S6N (forming NPM1–S6N molecular networks) and homotypic LLPS of NPM1 (forming NPM1–NPM1 molecular networks). It was previously suggested (Mitrea et al., 2018) that the ability of NPM1 to undergo multiple types of LLPS with different classes of nucleolar components (e.g., with rRNA and with ribosomal and nonribosomal proteins containing the R motif) plays a role in the maintenance of a liquid-like structural scaffold of the nucleolar GC. This buffering ability can compensate for variations in the network of NPM1 partners present in the nucleolus, as preribosomal particles are vectorially assembled from the fibrillar centers in the center of the nucleolus to the GC at the periphery. It was shown that the compositional and physical properties of NPM1–S6N droplets are modulated by competition between the heterotypic (NPM1–S6N) and homotypic (NPM1–NPM1) scaffolding mechanisms and that the interplay between these mechanisms provides dynamic adaptation to changes in the concentrations of SURF6 partners by modulating the available valence of NPM1 and the NPM1-dependent molecular scaffold in liquid-like droplets. Thus, NPM1 and SURF6 jointly contribute to the formation and functional regulation of the GC scaffold (Ferrolino et al., 2018).

It was also suggested (Ferrolino et al., 2018) that the role of SURF6 in the regulation of the composition and biophysical properties of the nucleolar matrix extends not only to GC, but also to fibrillar centers and the dense fibrillar component. Through its many competitive interactions with multiple nucleolar proteins, as well as DNA and RNA, SURF6 can dynamically modulate features of the nucleolar scaffold during ribosome biogenesis, which may facilitate the formation of the nucleolar scaffold gradient that directs the assembly of ribosomal particles. It was shown that the inclusion of SURF6 in homotypic NPM1 droplets changes the NPM1 mobility, the droplet viscosity, its composition, and the hydrophobicity. Thus, the discontinuity of the SURF6 concentration in the nucleolus and the related effect on the viscosity and hydrophobicity of the scaffold may contribute to this hypothetical gradient, which facilitates ribosome assembly (Ferrolino et al., 2018). Interestingly, dramatic differences in the local viscosity, hydrophobicity, and surface tension were shown to mediate the compartmentalization of the dense fibrillar component within the GC (Correll et al., 2019). However, further research will be required to test these hypotheses on how different types of competing scaffolds, including numerous proteins and nucleic acids, affect the molecular rearrangements within the nucleolus that accompany ribosome biogenesis.

Rrp14—YEAST SURF6 HOMOLOG

The presence of the conserved SURF6 domain indicated that the Rrp14 nucleolar protein of the baker’s yeast S. cerevisiae, a protein encoded by the ykl082c gene, is a homolog of murine SURF6. The amino acid residue identity between the conserved domains of murine SURF6 and the Rrp14 protein is 23%. Like the conserved domains of all proteins of the SURF6 family, the Rrp14 protein domain has a highly conserved core of 60 a.a., nine of which remain unchanged in all species (Polzikov et al., 2005). It is known that expression of the ykl082c gene is controlled by the specific yeast transcription factor Rap1, which is involved in the transcription of genes encoding proteins that participate in ribosome biogenesis (Planta, 1997).

The assembly of ribosomal subunits in yeast also begins in the nucleolus, where RNA polymerase I transcribes the 35S pre-rRNA precursor. After the processing and removal of external and internal transcribed spacers (ETS and ITS) during maturation, 18S, 5.8S, and 25S rRNA are generated from this precursor (Fig. 4a). After endonucleolytic cleavage of the rRNA precursor at the A2 site in yeast, the pre-40S and pre-60S subunits follow different pathways of biogenesis (Klinge and Woolford, 2019). The modular assembly of the 90S particle provides the basis for domain folding of the 18S rRNA precursor. In contrast to the 40S subunit, the architecture of the 60S subunit, with its six strongly intertwined 25S rRNA domains, is more complex (Fig. 4b) (Woolford and Baserga, 2013). In early pre-rRNA, domain VI binds to domains I and II and to the region of the 5.8S rRNA precursor (Fig. 4c). This is where the formation of the polypeptide exit tunnel begins. Its maturation progresses as rRNA domains fold in the following order: VI, V, III, and IV. The complete assembly of the polypeptide exit tunnel is achieved only when domain V is completely folded (Fig. 4c). Thereafter, several nucleoplasmic steps are required for ITS2 cleavage before the particles are exported to the cytoplasm, where they undergo final maturation (Kater et al., 2017). A feature of the pre-60S nucleolar particle is its open architecture, in which the solvent-exposed domains I, II, and VI are encapsulated by a number of ribosome assembly factors that form a ring-like structure on the solvent-exposed side (Fig. 5a). At the same time, various factors acting together sterically prevent premature RNA–protein and RNA–RNA contacts (Sanghai et al., 2018).

Fig. 4.

Structure of yeast pre-rRNA (Oeffinger et al., 2007) (a). 5'ETS and 3'ETS, external transcribed spacers; ITS, internal transcribed spacers of the ribosomal gene; 18S, 5.8S, 25S rRNA, regions encoding mature rRNAs. The sites of enzymatic cleavage are designated by the letters A, B, C, D, and E. The secondary structure of S. cerevisiae 25S and 5.8S rRNA (b). On the right, 25S rRNA contains six domains (I–VI) of the secondary structure. 5.8S rRNA (black) base pairs with domain I (adapted from https://crw-site.chemistry.gatech.edu/). These secondary structures are phylogenetically conserved in all kingdoms, although eukaryotic rRNAs contain segments not found in prokaryotic or archaeal rRNAs (Woolford and Baserga, 2013). The assembly sequence of pre-60S pre-rRNA domains (c). The assembly of ribosomal proteins and biogenesis factors on the forming 35S rRNA precursor begins cotranscriptionally. The formation of the polypeptide exit tunnel (shown here as a black circle) begins with the binding of domain VI to domains I and II and to the region of the 5.8S rRNA precursor. Its maturation progresses as the rRNA domains fold in the order VI, V, III, and IV and is completed only when domain V is completely folded, as is observed in State F (Kater et al., 2017).

Fig. 5.

Structure of the S. cerevisiae early nucleolar pre-60S particle obtained via cryoelectron microscopy (a). Selection of the cryo-EM density of the Rrp14 protein included in precursors of the large ribosomal subunit (pre-60S) of Saccharomyces cerevisiae (b). The image was created in PyMOL (Sanghai et al., 2018).

It was proven via equilibrium centrifugation in a sucrose gradient that the Rrp14 protein is a component of early 90S preribosomal complexes and is associated with precursors of 60S ribosomal particles (Oeffinger et al., 2007a). The depletion of the Rrp14 pool with gene knockout leads to almost complete blocking of the processing of 20S pre-rRNA, a precursor of 18S rRNA, and 27S pre-rRNA, a precursor of 25S rRNA, and is accompanied by cell death.

It was also shown that knockout of Rrp14, the yeast SURF6 homolog, leads to the accumulation of aberrant A2–C2 fragments (Fig. 4a). Thus, Rrp14 can directly or indirectly prevent premature cleavage of ITS2 at the C2 site. The degradation of ITS2 begins at this site in yeast, without affecting other stages of preribosomal RNA maturation (Oeffinger et al., 2007a).

It was recently shown with cryoelectron microscopy that Rrp14, together with the Ssf1–Rrp15 heterodimer, forms a ring structure that encapsulates domains I, II, and VI of 25S rRNA, which line two sides of the forming polypeptide exit tunnel. While the long Ssf1–Rrp15 complex is located at the interface between domains I and VI, the conserved C-terminal helix of Rrp14 connects domains II and VI. In this case, Ssf1 occupies the same position as Rpl31. In later pre-60S particles, Rpl31 binds at the interface between domains III and VI near the polypeptide exit tunnel, which is created by domains I, III, and VI on the side exposed to the solvent (Sanghai et al., 2018; Klinge and Woolford, 2019). At this stage, the ribosomal proteins Rpl7, Rpl13, Rpl17, Rpl24, and Rpl26 are already bound to pre-rRNA. The subsequent stage of maturation is believed to involve the association of domains III and VI and the formation of the polypeptide exit tunnel on the side exposed to the solvent. This may be accompanied by the insertion of the N terminus of the Nog1 protein into the forming tunnel and the replacement of Ssf1–Rrp15 by Rpl31 (Sanghai et al., 2018). The hierarchical sequence in which early biogenesis factors of the large ribosomal subunit are recruited was determined via the expression and purification of truncated pre-60S pre-rRNA. It was shown that Rrp14 is associated with domain IV of 25S rRNA (Chaker-Margot and Klinge, 2019).

Thus, the Rrp14 protein, together with the Ssf1 and Rrp15 proteins (PPAN and Rrp15 in humans, respectively), acts as a chaperone of domains I and VI, which form two sides of the forming polypeptide exit tunnel, before Rpl31 binding and ITS2 removal. The proteins Ssf1/PPAN, Rrp15, Rrp1/Nop52, Ebp2/EBNA1BP2, Brx1/Brix1, NSA1/WDR74, YTM1/WDR12, Rpf1, Erb1/BOP1, Nop7/Pes1, Nop15/MKI67IP, Nop16, Mak16, and Nog1/GTPBP4 are associated with pre-rRNA simultaneously with Rrp14 (Sanghai et al., 2018; Klinge and Woolford, 2019).

A number of proteins were shown earlier, with affinity chromatography and the yeast two-hybrid system, to interact with the Rrp14 protein in the yeast S. cerevisiae (Table 1). As can be seen, most of the Rrp14 protein partners are involved in ribosome biogenesis. Some of them (Rrp1, Ebp2, Ssf1, NSA1, Nop15, Mak11) were identified in the same ribonucleoprotein complex as Rrp14 (Sanghai et al., 2018). Moreover, the proteins Ssf1, LOC1, DBP9, and Mak11, like Rrp14, are associated with domain IV of pre-60S (Chaker-Margot and Klinge, 2019).

Table 1. List of baker’s yeast S. cerevisiae proteins that interact with the Rrp14 protein, a member of the evolutionarily conserved SURF6 protein family

It is important to note that some of the Rrp14 protein partners are involved in the determination of cell polarity and in proliferation. It was shown that yeast strains with a reduced Rrp14 content had defects in the positioning and elongation of the mitotic spindle, which had not previously been observed with a decrease in the content of other ribosome biogenesis factors (Yamada et al., 2007). Therefore, it is logical to assume that the Rrp14 protein can regulate and coordinate important biological processes: ribosome biogenesis, the cell cycle, and cell polarity.

As described earlier, the human homologs of Rrp1 and Ebp2 are also protein partners of SURF6 (Kordyukova et al., 2014b), which supports the involvement of SURF6 in the biogenesis of the large ribosome subunit in mammals.

SURF6 PARTICIPATION IN MAMMALIAN RIBOSOME BIOGENESIS

In contrast to that in lower eukaryotes, the biogenesis of mammalian ribosomes is insufficiently studied. This is mainly because preribosomal complexes cannot be purified with the currently existing extraction protocols, since nucleoli are not effectively solubilized without disruption (Nieto et al., 2020).

It was shown via equilibrium centrifugation in a sucrose gradient that human SURF6 cofractionates with large subunit precursors (Couté et al., 2006). Nevertheless, an RNA interference screen of nucleolar proteins, which assessed changes in the quantitative content of intermediate pre-rRNA processing products, showed that knockdown of SURF6, like that of its partner NPM1, does not lead to significant changes in ribosome biogenesis (Tafforeau et al., 2013). This is possibly due to insufficient knockdown efficiency.

It was shown that SURF6 overexpression in mouse fibroblasts markedly increases the amount of all intermediate rRNA products except 36S rRNA, which is the longest common precursor for 5.8S and 28S rRNA. The most notable changes concern the accumulation of 45S pre-rRNA (the common precursor for 18S, 5.8S, and 28S rRNA formed downstream of 47S pre-rRNA), 34S and 20S rRNA (18S rRNA precursors), and 32S rRNA (Moraleva et al., 2017).

It was also shown that SURF6 overexpression in mouse cells leads to a more than sevenfold increase in the content of fragments of the second internal transcribed spacer ITS2 and an almost twofold increase in the content of long-lived regions of the 5'-external transcribed spacer 5'ETS. The change in the content of fragments corresponding to 18S rRNA, ITS1, 5.8S rRNA, and 28S rRNA is less pronounced than that for ITS2 and 5'ETS.

It was also demonstrated via fluorescence in situ hybridization (FISH) that SURF6 colocalizes with ITS2 and 5'ETS. The available data do not indicate that SURF6 participates in site-specific cleavage of ITS2, but they suggest that murine SURF6 may be involved in the stabilization of ITS2 and 5'ETS (Polzikov et al., 2010), which is consistent with the data obtained for Rrp14, the SURF6 homolog in yeast. The stabilization of ITS2 is likely an evolutionarily conserved function of the SURF6 protein family. The mechanisms by which SURF6/Rrp14 stabilizes ITS2 remain an open question.

It was suggested that SURF6 binds directly to ITS2, preventing its cleavage by ribonucleases. However, no ribonucleases were described among the protein partners of SURF6, and no proteins with ribonuclease activity were identified among the protein partners of Rrp14.

SURF6 ACTIVITY IN CELL DIFFERENTIATION AND PROLIFERATION

It was experimentally shown that depletion of the Rrp14 protein pool in yeast (Oeffinger et al., 2007a; Yamada et al., 2007) and of the SURF6 protein pool in somatic cells (Polzikov et al., 2007) and mouse embryos (Romanova et al., 2006) results in cell death. These data allow us to consider proteins of the SURF6 family as evolutionarily conserved and vital eukaryotic housekeeping proteins. However, the study of the yeast SURF6 homolog, Rrp14, showed that this protein is multifunctional and is necessary not only for the assembly of ribosome subunits and rRNA processing but also for cell polarization and division (Oeffinger et al., 2007a; Yamada et al., 2007).

Differential display of mRNA and Northern blotting showed that the expression of the Surf-6 gene at the transcript level is regulated during the differentiation of embryonic stem cells. The effect of erythropoietin, a cytokine that plays an important role in the division and differentiation of erythroid cell precursors, on Surf-6 expression was studied. The addition of erythropoietin to the culture medium of embryonic stem cells expressing the erythropoietin receptor and to hematopoietic progenitor stem cells strongly increased the level of Surf-6 gene expression (Xia et al., 2000).

Analysis of the expression of the Surf-6 gene during embryogenesis in the clawed frog (Wolff et al., 2002) showed that the highest expression level was observed at stages 30–38 of embryogenesis, corresponding to the lateral phase. A decrease in expression was observed at stages 8–9, corresponding to the blastula stage, and at stages 40–45, corresponding to the beginning of neuron generation in the thalamus before the completion of thalamus formation. The high level of Surf-6 gene expression at most stages of embryogenesis, as well as the decrease in expression at the terminal stages of embryogenesis, may indicate the participation of the SURF6 protein in the processes of cell differentiation in clawed frog embryos. Similar results were obtained in a study of Surf-6 expression in adult D. melanogaster as compared with the embryonic and larval stages: the expression peak is observed at embryonic stages 0–12 and at 18–24 h in adult females, and the processes of cell differentiation were accompanied by a decrease in the SURF6 content (Arbeitman et al., 2002).

Thus, the Surf-6 gene and its possible homologs are important participants in cell metabolism and are presumably associated with processes of cell differentiation and proliferation.

It was shown that induction of SURF6 knockdown in mouse fibroblasts leads to cell death by apoptosis. It was also shown via flow cytometry that knockdown increases the proportion of cells in the G1 phase by ~8% and decreases the proportion of cells in the S phase by ~7% in comparison with the control. The proportion of cells in the G2 phase and mitosis remained practically the same in both cell populations. These observations indicate that SURF6 knockdown prevents cells from transitioning from the G1 to the S phase of the cell cycle (Polzikov et al., 2007).

As mentioned earlier, the ykl082c and ZK546.14 genes, which encode the proteins Rrp14 and CE02914, respectively, are homologs of the mammalian SURF6 gene in the yeast S. cerevisiae and the nematode C. elegans. Deletion of two copies of the ykl082c gene via homologous recombination is lethal in yeast (Giaever et al., 2002). Suppression of ZK546.14 gene expression in the nematode is accompanied by growth retardation and developmental abnormalities in adults. Depletion of the CE02914 protein leads to the arrest of embryonic development at the larval stage and subsequent death (Piano et al., 2002). Taken together, these observations indicate that the SURF6 nucleolar protein is a vital eukaryotic protein.

BIOLOGICAL ROLE OF SURF6 IN THE EMBRYONIC DEVELOPMENT OF MAMMALS

In contrast to somatic cells, early mammalian embryos are characterized by a period of transcriptional dormancy and the presence of nucleolar precursor bodies, which are assembled inside the nucleus into mature nucleoli starting from the two-cell embryo stage. In the preimplantation period of mouse embryo development, the formation of the nucleolus does not end until the morula stage. Surf-6 mRNA was identified via polymerase chain reaction (PCR) at the one-cell embryo stage, but its amount significantly decreased during the transition from the one-cell stage to the two-cell stage. In addition, it was found that the amount of Surf-6 mRNA increases upon the transition from the two-cell stage to the morula stage. It was also shown that the distribution of SURF6 in mouse embryos is similar, although not identical, to the distribution of other nucleolar proteins, B23 and fibrillarin. The depletion of SURF6 by RNA interference leads to a decrease in the content of 18S rRNA, the arrest of embryonic development at the morula stage, and the subsequent death of the embryos. Taken together, these observations suggest that SURF6 is involved in early embryogenesis, not only in fruit flies and nematodes but also in mammals (Romanova et al., 2006).

SURF6 ASSOCIATION WITH PROLIFERATIVE STATUS OF CELLS

It was shown that the highest SURF6 level during the cell cycle is observed in the G2 phase, and the SURF6 level in cells in a state of proliferative dormancy (G0 phase) is markedly reduced in comparison with an asynchronous culture (Gurchenkov et al., 2005). In an analysis of various cell types, a particularly high level of SURF6 expression was described in actively proliferating cells: in embryonic cells, progenitors, and hematopoietic stem lines (Ringwald et al., 2012) and in some tumor cells (http://www.proteinatlas.org/ENSG00000148296-SURF6/cancer). Interestingly, SURF6 is absent in mouse spleen lymphocytes, but its content progressively increases after the cells are activated to proliferation in vitro by mitogen. In this case, SURF6 expression begins earlier than that of the PCNA and Ki-67 proteins, which are the main markers of proliferation used in cancer diagnostics (Moraleva et al., 2009). The SURF6 content was studied by immunocytochemistry and immunoblotting in human lymphocytes that were normal, resting, or activated to proliferation by mitogen in vitro and in native lymphocytes obtained from patients with lymphoproliferative diseases. The SURF6 nucleolar protein is not detected in normal lymphocytes under certain conditions, but it is found in activated lymphocytes, as well as in lymphocytes of patients, with the highest level observed in patients diagnosed with mantle cell lymphoma. Thus, SURF6 can serve as a new activation marker for lymphocytes (Malysheva et al., 2010a, 2010b; Moraleva et al., 2020).

The disordered structure and high positive charge of the SURF6 molecule facilitate its interaction with NPM1, which, according to some authors, is a key condition for the formation and functioning of the nucleolus and its individual subdomains (Smetana, 2002; Ferrolino et al., 2018). This may be the reason for the change in the SURF6 content during activation, blast transformation, and cellular dedifferentiation.

In a large-scale screen of colony formation after cDNA transfection, SURF6 was identified as a protein that is presumably associated with tumorigenesis in cultured mouse fibroblasts and human tumor cells (Wan et al., 2004). It was shown that SURF6 overexpression in mouse fibroblasts has no noticeable cytotoxicity, but it significantly accelerates proliferation by reducing the G1/S transition time (Moraleva et al., 2017). These data support the previously stated hypothesis that mammalian SURF6 is a putative oncoprotein (Wan et al., 2004).

CONCLUSIONS

Summarizing the literature data, we can conclude that the SURF6 protein of mammalian nucleoli is necessary for cell viability. Like the yeast homolog Rrp14, it participates in ribosome biogenesis and contributes to the stabilization of the internal transcribed spacer ITS2 and, possibly, 5'ETS. Analysis of the available literature data also suggests the involvement of the SURF6 nucleolar protein not only in ribosome biogenesis but also in the regulation of cell proliferation and differentiation. However, direct data on the functional significance of the SURF6 protein in mammalian cells are still absent.