Introduction

In multicellular metazoa, the specification and maintenance of the germline are of paramount importance for the propagation of species. Upon reunion of the parental genomes via fertilization, the germline can regenerate a whole new body with all types of cells and organs, and the germline is therefore considered the ultimate totipotent stem cell [1]. In most species, germline specification and development depend on the highly conserved “germ granules”: electron-dense, membraneless ribonucleoprotein (RNP) particles that were given different names when first discovered, such as P granules in worms, the pole plasm in flies, or nuage (e.g., intermitochondrial cement, chromatoid body) in mammals [2, 3]. Germ granules were initially posited by Weismann and others, who speculated that a “dark material”, the germ cell determinant, carries the hereditary information from generation to generation and that, when it is destroyed, germline development cannot be initiated [3,4,5]. Since then, mounting evidence has revealed the morphological characteristics and the RNA/protein contents of these granules, and their roles in orchestrating germline development and maintenance in a wide range of metazoan species. In worms, P granules are distributed uniformly in the cytoplasm of the fertilized, unpolarized one-cell embryo. After symmetry breaking, P granules are visible only in the posterior half of the embryo; the embryo then divides to give rise to a germline progenitor cell containing P granules and a somatic daughter cell lacking them [6,7,8]. Earlier work proposed two successive processes underlying the establishment of the germ granules: (i) the initial assembly of P granules as they migrate to the posterior end by cytoplasmic flow, followed by (ii) the degradation or disassembly of anterior P-granule contents [9,10,11].
Nevertheless, the physical nature of how the granules are dynamically assembled and disassembled in response to environmental stimuli and developmental cues was uncovered only recently by Hyman’s lab [8]. They found that the P granules in Caenorhabditis elegans behave like liquid droplets, which exhibit a highly dynamic transition between a soluble form and a condensed form via a process known as phase transition (or phase separation). Phase separation refers to the demixing of a homogeneous liquid system into two distinct liquid phases, and has been well studied in polymer chemistry. This study, perhaps for the first time, attributed the physical properties of liquid droplets to a non-membrane-bound subcellular structure (germ granules) in cells, thereby linking phase separation to cell biology [8, 12]. Since then, the field has expanded rapidly, with intensive studies revealing that many membraneless organelles, including nuclear bodies such as nucleoli and Cajal bodies and cytoplasmic bodies such as stress granules and processing bodies (P bodies), share the biophysical features of liquid droplets [11, 13, 14]. They are structurally dynamic, size-dependent, labile to shifts in pH, temperature, or ionic strength, and subject to regulation by post-translational modifications (e.g., arginine methylation). These organelles are characterized by liquid-like behaviors, e.g., dripping, fusion and wetting, and are assembled as a result of liquid–liquid demixing into two-phase systems, also known as liquid–liquid phase separation (LLPS) or coacervation [15,16,17,18]. As a complement to the traditional membrane-enclosed organelles, these membraneless particles are highly enriched in proteins and nucleic acids, and were therefore recently named “biomolecular condensates”.
Notably, LLPS strongly affects the distribution of protein components: proteins are enriched 10–300-fold in the coacervate relative to the dilute phase [13, 16].

Recent advances have revealed one striking feature of the protein constituents of membraneless organelles: the abundance of intrinsically disordered proteins (IDPs) or hybrid proteins with both ordered domains and intrinsically disordered protein regions (IDPRs) [18, 19]. These proteins prove to be crucial constituents for phase separation. Perhaps one of the best-characterized IDPs is Vasa (Vas), also known as Mvh (mouse Vasa homolog) or Ddx4 (DEAD-box polypeptide 4) (hereafter we use Ddx4 in mammals), which has been widely studied as a highly conserved, germ cell-specific marker necessary for germline specification and maintenance (Fig. 1A) [20,21,22,23,24,25]. Vasa was first identified as a germline factor over 30 years ago in a large-scale genetic screen for maternal-effect recessive lethal mutations in the widely used genetic model organism Drosophila, and was subsequently cloned and shown to be an essential component of polar granules in Drosophila [25, 26]. Since then, Vasa homologs have been intensively characterized as a highly conserved germline marker in a myriad of metazoan species. These studies have illuminated the multifaceted functions of Vasa in many RNA metabolism-dependent processes in germline development, such as piRNA processing and ribonucleoprotein (RNP) granule formation. Interestingly, recent findings suggest that Vasa/Ddx4 is also expressed beyond the germline, for example in stem cells (e.g., neoblasts in planarians) and in tumors. This is consistent with the view that germ cells are the ultimate totipotent stem cells, and with the activation of germline-specific transcriptional programs in tumor cells (the so-called cancer-testis genes) [23, 27, 28].
While extensive studies have been performed and excellent reviews have discussed the versatile roles of Vasa/Ddx4 in the germline and in other tissues among diverse species [22, 23], the exact functions of Vasa/Ddx4 are still far from clear. In this review, we summarize and assess the roles of Vasa/Ddx4 by relating its structural domains, including the unstructured tails and the helicase motifs, to phase separation, granule formation and piRNA biogenesis throughout germline development.

Fig. 1

Vasa orthologs are evolutionarily conserved and harbor intrinsically disordered regions at both the N-terminus and C-terminus. A The phylogenetic tree of Vasa orthologs from 12 animal species, with schematic domain composition illustrated on the right. Estimated divergence times were obtained from the TimeTree database [102]. B In silico prediction of protein disorder tendency for Vasa orthologs from mouse (blue), human (red), Drosophila melanogaster (black), and zebrafish (yellow). Residues above the dotted line are predicted to be disordered by IUPred2A [103]. Conserved regions (DEAD-box helicase domain) are indicated by colored rectangles below. C Multiple sequence alignment of Vasa orthologs across 12 metazoan species using Jalview [104]. Amino acids highlighted in red represent conserved motifs of the DEAD-box helicase family (Motifs I–VI), with names labeled below. Green highlights the divergent acidic C-terminal sequence across the 12 species

Vasa/Ddx4 is highly conserved across metazoan species in the DEAD-box helicase family

Vasa orthologs are highly evolutionarily conserved and are present in most metazoan species (Fig. 1A, C). Vasa and its mammalian ortholog, Ddx4, harbor consensus DEAD-box helicase motifs, and thus belong to the DEAD-box family. The DEAD-box protein family was initially defined by a handful of conserved sequence motifs identified when comparing a number of evolutionarily conserved RNA metabolism-related proteins, including Vasa, eIF4a, p68, PL10 and others [29, 30]. Alignment of amino acid sequences typically reveals at least nine conserved sequence motifs, including motifs Q, I, Ia, Ib, II (DEAD), III, IV, IVa, V and VI, that exhibit minimal phylogenetic variation across metazoan species, as illustrated in Fig. 1C. In general, the helicase core of a DEAD-box family protein comprises two recombinase A (RecA)-like domains, the N-terminal domain (NTD) and the C-terminal domain (CTD), which closely interact with each other and, through ATP binding and hydrolysis, sharply bend the bound single- or double-stranded RNA, leading to the disruption of the RNA duplex. As such, DEAD-box helicase family members can alter (melt) the local secondary structures of bound substrate RNAs in an ATP-dependent manner. This process is required in a broad range of RNA metabolic processes [31]. Motif II, also known as the Walker B motif, encompasses the canonical Asp-Glu-Ala-Asp (D-E-A-D) residues that give the family its name. Motif Q, Motif I (Walker A motif), Motif II and Motif VI are involved in ATP binding and hydrolysis. In comparison, Motifs Ia, Ib, IV and V are composed of consensus residues that interact with the sugar-phosphate backbone of substrate RNAs, as revealed by the crystal structure of Vasa (Fig. 1B, C) [32, 33]. The remaining motifs coordinate the interdomain interplay that couples RNA unwinding to ATP binding and hydrolysis.
It is, however, worth noting that these motifs are not individually dedicated to one function, but often participate in other interactions accompanied by the conformational change of the Vasa/RNA complex [31, 33].

Vasa/Ddx4 is a founding member of the DEAD-box family, which comprises the largest group of nucleic acid helicases encoded in metazoa. It harbors all the typical helicase motifs, which can change the susceptibility of substrate RNAs to nuclease digestion. However, as is commonly observed for most other DEAD-box family proteins, Vasa/Ddx4 has divergent tails at both the N-terminus and the C-terminus flanking the canonical helicase domain. The amino acid sequences at both ends vary significantly from Drosophila to higher vertebrates (Fig. 1C). Notably, sequence diversity at the N-terminus is present even between closely related species, such as D. melanogaster and D. simulans [34]. Growing evidence shows that the tail sequences are intrinsically disordered regions (IDRs), which drive phase separation and endow the Vasa/Ddx4 protein with versatile functions fundamental to germline development, as discussed below [29, 30, 35].

Ddx4-mediated nuage formation resulting from phase separation: the tails tell the tale

As mentioned above, while the two tandem RecA-like helicase domains (homologous to the bacterial RecA domain) are highly conserved among members of the DEAD-box family, these proteins typically harbor variable low-complexity sequences at their tails, namely intrinsically disordered regions (IDRs) (Fig. 1B, C), with a longer N-terminal tail and a shorter C-terminal tail. IDRs were originally defined by the presence of a disordered stretch of at least 30 amino acids [36]; they are often highly enriched in disorder-promoting residues, such as Arg, Gly, Lys, Ser and Pro, but depleted of order-promoting residues, including Leu, Trp, Tyr, Cys, Asn, Ile and Val. Although it has long been known that the function of a protein depends on the 3D structure encoded by its primary sequence, growing evidence reveals that many proteins function without a stable structure in vivo [37]. For instance, the IDR sequences of intrinsically disordered proteins (IDPs) differ significantly from structured proteins or domains in many respects, e.g., low complexity, repetitive sequences (RG, YG, QN, etc.), hydrophobicity, charge, and flexibility [38]. IDPs are highly enriched in biomolecular condensates and have been shown to initiate phase transition both in vitro and in cellulo [18, 38]. While IDPs usually lack tertiary contacts, they can bind a vast number of protein or nucleic acid partners through short sequence motifs and low-complexity sequences. In particular, the interacting partners of an IDP can differ significantly between cellular settings.
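The compositional bias described above can be made concrete with a minimal sketch. It uses only the two residue sets listed in this paragraph; the scoring rule (fraction of disorder-promoting residues minus fraction of order-promoting residues) and both toy sequences are illustrative assumptions, not a published disorder predictor and not real Vasa/Ddx4 sequence.

```python
# Residue sets taken from the paragraph above (one-letter codes).
DISORDER_PROMOTING = set("RGKSP")    # Arg, Gly, Lys, Ser, Pro
ORDER_PROMOTING = set("LWYCNIV")     # Leu, Trp, Tyr, Cys, Asn, Ile, Val


def composition_bias(seq: str) -> float:
    """Return (disorder-promoting - order-promoting) residues / length.

    This score is an illustrative assumption: it ranges from -1 (only
    order-promoting residues) to +1 (only disorder-promoting residues).
    """
    seq = seq.upper()
    if not seq:
        return 0.0
    promoting = sum(aa in DISORDER_PROMOTING for aa in seq)
    ordering = sum(aa in ORDER_PROMOTING for aa in seq)
    return (promoting - ordering) / len(seq)


# Hypothetical toy sequences, for illustration only.
toy_tail = "GRGGSPKRGGSGRGKPSG"   # low-complexity, tail-like
toy_core = "LVIYWCNLVIAYLCNVIW"   # enriched in order-promoting residues

if __name__ == "__main__":
    print("tail-like:", composition_bias(toy_tail))
    print("core-like:", composition_bias(toy_core))
```

A composition score of this kind ignores sequence order entirely, which is why dedicated predictors such as the one used for Fig. 1B are preferred for real analyses.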

IDR sequences have been intensively explored as the primary driving force for LLPS, the demixing of one liquid phase from another, which results in the formation of biomolecular condensates, also known as liquid droplets or membraneless organelles. They are typically micron-sized coacervates, predominantly consisting of ribonucleoprotein (RNP) assemblies that exclude the bulk aqueous phase. These liquid droplets have been extensively studied in different settings, such as P bodies, stress granules, Cajal bodies, and nucleoli [11, 39]. It is believed that these membraneless RNP granules, like membrane-enclosed organelles, provide an alternative local solvent milieu that allows specialized biochemical reactions, while having the additional advantage of permitting efficient exchange of constituents with the surrounding nucleoplasm or cytoplasm. In addition, the absence of an enclosing membrane enables them to respond quickly to external environmental stimuli via rapid assembly and disassembly [11, 14, 15, 40]. On the one hand, the low-complexity sequence in IDPs is necessary and sufficient to drive phase transition for the assembly of liquid droplets independent of RNA molecules. On the other hand, proteomic analyses revealed abundant RNA-binding proteins with low-complexity sequences, such as FUS, NANOS and Pumilio, enriched in RNP granules irrespective of cell type or tissue origin, possibly implying synergistic enhancement of phase separation among the IDPs contained in the droplets [19, 38, 40].

Germ granules are such a typical form of liquid droplets that differ with respect to morphology and RNP contents among diverse species or at varied developmental stages in germline as aforementioned. Ddx4 probably exemplified one of the best-characterized IDPs driving the assembly of germ granules. Nott et al. validated that Ddx4YFP, wherein the conserved DEAD-box cassette was substituted by the YFP sequence, inherently condensed to form liquid-like organelles in the nuclei of HeLa cells [41]. These liquid-like droplets display rapid assembly and disassembly in response to changing surrounding milieus, such as the temperature, salt concentration or tonicity. Furthermore, the disordered sequences of Ddx4 could reversibly form granular structures in vitro, resembling those formed within cells, and the FRAP experiments further demonstrated the similar dynamic internal architecture, indicating that the N-terminal intrinsically disordered region of Ddx4 is the sequence determinant that is sufficient and indispensable to drive the phase separation [41]. Further experiments showed that the droplet stability is primarily maintained by the electrostatic interactions among the interior protein molecules, but could be attenuated by the post-translational methylations via the protein arginine methyltransferase family, e.g., asymmetric dimethylarginine (ADMA) in the six predicted methylation sites at N-terminus of Ddx4 [35, 41]. In agreement with this, a noteworthy feature in the disordered N-terminal tail of Ddx4 lies in that the charged residues are distributed into clustered blocks of net positive and negative charge, consisting of 8–10 residues in length for each block along with pair repeats of FG or RG held in close vicinity [35, 38, 41]. 
When these residues were swapped to scramble the blocks while simultaneously maintaining the same net charge overall, the newly assembled protein failed to form organelle-like structures in vitro or in cells, indicative of an important role of charge patterning in droplet assembly. Intriguingly, the similar occurrences of short, repeated peptide motifs, such as RG/YG, FG and QN, have been observed to facilitate the assembly of FUS proteins, nuclear pore complex, and P bodies, respectively, suggesting that multivalent interactions, likely through cation-pi interactions, could drive the self-assembly of disordered sequences [6, 18, 42].
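The logic of the charge-scrambling experiment can be illustrated with a minimal sketch: two sequences with identical composition and identical net charge can differ sharply in how that charge is distributed along the chain. The 10-residue window follows the block length mentioned above; the patterning score (mean absolute net charge per sliding window) and both toy sequences are assumptions made for illustration, and are not the metric or the Ddx4 constructs used in the cited studies.

```python
POSITIVE = set("KR")   # Lys, Arg
NEGATIVE = set("DE")   # Asp, Glu


def net_charge(seq: str) -> int:
    """Net charge, counting +1 for K/R and -1 for D/E (His ignored)."""
    seq = seq.upper()
    return sum(aa in POSITIVE for aa in seq) - sum(aa in NEGATIVE for aa in seq)


def window_charges(seq: str, window: int = 10) -> list:
    """Net charge of every sliding window of the given length."""
    return [net_charge(seq[i:i + window]) for i in range(len(seq) - window + 1)]


def patterning_score(seq: str, window: int = 10) -> float:
    """Mean absolute window charge: high for blocky, ~0 for well-mixed."""
    charges = window_charges(seq, window)
    if not charges:
        return 0.0
    return sum(abs(c) for c in charges) / len(charges)


# Hypothetical toy sequence with alternating blocks of 10 positive and
# 10 negative residues (not a real Ddx4 sequence).
blocky = ("KRKRKRKRKR" + "DEDEDEDEDE") * 2

# "Scrambled" variant: same residues, interleaved so charges alternate.
positives = [aa for aa in blocky if aa in POSITIVE]
negatives = [aa for aa in blocky if aa in NEGATIVE]
mixed = "".join(p + n for p, n in zip(positives, negatives))

if __name__ == "__main__":
    for name, seq in (("blocky", blocky), ("mixed", mixed)):
        print(name, "net charge:", net_charge(seq),
              "patterning:", round(patterning_score(seq), 2))
```

Both toy sequences have the same composition and net charge, so any difference in the score reflects patterning alone, which is the variable that the scrambling experiment isolates.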

Sexually dimorphic functions of Vasa/Ddx4 in metazoa: Vasa/Ddx4 knockout leads to sex-specific sterility phenotypes in Drosophila and mice

Vas deletion causes female sterility and defective embryonic patterning in Drosophila. In Drosophila, Vas is a maternally derived factor that localizes to two functionally distinct compartments inside the egg chamber: the pole plasm in the oocyte and the nuage in nurse cells (Fig. 2A). Active transcription and translation in the nurse cells produce RNAs and proteins that are subsequently transported to the neighboring oocyte and are essential for oocyte development and embryonic patterning. In the fertilized oocyte, it has been known for decades that the short form of Oskar (short Oskar) is transported via the cytoskeletal microtubules to the posterior tip of the oocyte, where it assembles a functional subcellular structure, the pole plasm, which determines the embryonic anterior–posterior axis and the specification of future primordial germ cells (PGCs) [43]. Vasa is specifically recruited to the pole plasm in part through interaction with the C-terminal extended LOTUS domain (eLOTUS) of short Oskar. However, the fact that long Oskar, which contains the same LOTUS and OSK domains as short Oskar, prevents the interaction between Oskar and Vasa suggests that the N-terminal extension of long Oskar regulates the interaction between Vasa and Oskar [44]. Notably, the LOTUS domain is highly conserved from bacteria to vertebrates, and was originally defined by its presence in two members of the TUDOR protein family, TDRD5 and TDRD7, in mammals (Table 2) [44, 45]. In the nurse cells of Drosophila, their orthologs Tejas and Tapas are critical for RNA-independent Vasa localization to the nuage and are involved in the piRNA-mediated retrotransposon silencing pathway, although the Tudor domains themselves recognize and interact with arginine-methylated, RGG-enriched motifs in the N-terminal tail of Vasa [46] (Table 2).
Together, this evidence suggests that Vasa serves as an upstream regulator in the assembly of germ plasm in oocytes and of nuage in nurse cells.

Fig. 2

Schematic diagrams of germline development and dynamic distribution of nuage in Drosophila and mice. A Anatomy of a single ovariole illustrating nuage formation during Drosophila oogenesis. The Drosophila ovaries comprise 16–20 tubular structures, termed ovarioles, each arranged as a production line of differentiating egg chambers that produces mature eggs (oogenesis). The functional unit of the Drosophila ovary is the ovarian follicle, or egg chamber, which originates from the germarium located at the anterior tip of each ovariole. Germline stem cells (GSCs) (red) undergo asymmetric cell division, giving rise to one daughter stem cell that remains in contact with the cap cells and another daughter cell destined to differentiate into a cystoblast (CB). Following four rounds of synchronous mitotic divisions, 16 interconnected cystocytes (light red) are produced, which bud off the germarium once they are surrounded by somatic follicle cells. During the development of each egg chamber, one of the 16 cystocytes commits to meiosis and becomes the oocyte at the posterior end, whereas the remaining cells develop into polyploid nurse cells, which are interconnected through intercellular bridges. Vasa-enriched nuage in nurse cells and Vasa-containing polar granules in the oocyte are highlighted in green. S, stage. B Schematic diagram of nuage distribution (green) in the male mouse germline. Primordial germ cells (PGCs) are induced in a subpopulation of the epiblast in early embryos starting on embryonic day 6.5 (E6.5). Ddx4 is detectable in PGCs from E10 onwards. Pi-bodies and piP-bodies are present in prospermatogonia in the perinatal testis. Intermitochondrial cement (IMC) is present in spermatocytes. The chromatoid body (CB) is assembled in haploid spermatids. All granules (green) are localized around the nuclear membrane within the cytoplasm

The first complete Vas-null Drosophila model, namely VasPH165, wherein a 7343-bp deletion removed the entire coding region of Vas, was established in 1998 by Paul Lasko’s laboratory [47] (Table 1). In the female Vas-null Drosophila, a series of defects were observed in both the nurse cells and the oocytes, such as reduced numbers of germline stem cells and developing cysts in the germarium, the tumorous or degenerated egg chambers starting from stage 6, the absence of the oocytes, or undifferentiated oocytes with diffuse nucleoli. This phenomenon could be linked to the mis-regulated localization and/or translation of a handful of oocyte-specific mRNAs, suggesting that Vasa is a master coordinator in regulating Drosophila female germline development [26, 43, 47]. Interestingly, Vas-null males do not exhibit any defects in the male germline, implying a sex-specific role of Vasa in germline development. Thereafter, extensive studies have been designed trying to decipher the specific function of different segments of Vasa protein in Drosophila oogenesis, including the rapidly evolving N-terminal tail, the central RecA-like helicase core, and the acidic carboxyl tail. In general, as summarized in Table 1, the diverse phenotypes in individual Vasa mutants are closely correlated with the extent to which the Vasa mutations caused the deleterious effects. For instance, deletion of the N-terminal tail region (Vas∆3–200) led to defects in Grk translation, Vasa localization as well as transposon control. Mutations in the central helicase domain, even with a single residue substitution, caused a spectrum of phenotypes, including defective oogenesis owing to aberrant Vasa localization and transposon derepression [47,48,49]. The tail sequences of Vasa protein at both ends are highly divergent across phyla (Fig. 1C). Intriguingly, exclusive deletion of the extreme C-terminal end with seven acidic residues (Fig. 
1C) abrogated the pole cell specification and translational activation of Grk, and derepressed the transposons, suggesting an indispensable role of the tail sequence in the germline [48, 49].

Table 1 Phenotypes with mutant Vasa homologs in Drosophila and in mice

Ddx4 knockout led to male infertility. In the mouse germline, Ddx4 expression is detectable in the gonads starting at embryonic day (E) 10.5–11.5, just before sex is determined [20, 21]. From then on, protein expression is maintained at high levels in both sexes, until post-meiotic spermatids in males and primary oocytes in females. In Ddx4 knockout mice, in which deletion of exons 9 and 10 causes a null mutation, the proliferative activity of PGCs was markedly reduced [21, 50]. After birth, Ddx4-null spermatogonia continued to differentiate and entered meiosis, but failed to progress beyond the zygotene stage and instead underwent apoptosis. However, in contrast to Vasa-null Drosophila, female Ddx4-null mice are completely fertile without any morphological defects in the oocytes [21]. Pillai’s group generated a point-mutation knock-in mouse model, Ddx4KI, in which the consensus ATPase motif was mutated (DEAD→DQAD), thereby abolishing the RNA helicase activity. Interestingly, Ddx4+/KI mice displayed male-specific sterility even though both the wild-type and mutant Ddx4DQAD proteins are co-expressed, suggesting a dominant-negative effect of this ATPase-deficient mutant [51]. Spermatogenesis proceeded efficiently but arrested uniformly at the round spermatid stage, although it is currently not clear to what extent the function of haploid spermatids was impaired in Ddx4+/KI mice. In contrast to Ddx4-null males, however, no adverse effect on the piRNA biogenesis pathway was observed in the Ddx4+/KI testis, as corroborated by the repression of TEs and the normal profiles of Mili- and Miwi-bound piRNAs. In comparison, the single mutation in the DEAD motif abolished Ddx4 expression or destabilized the Ddx4 protein in Ddx4−/KI mice, in which Ddx4 is not detected, leading to similar phenotypes in Ddx4−/KI and Ddx4-null mice.
In addition, retrotransposons were derepressed only in Ddx4−/KI mice and not in Ddx4+/KI mice, consistent with helicase-independent functions of Ddx4 in male germline differentiation [51].
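At the sequence level, the knock-in allele described above amounts to a single Glu→Gln substitution within the DEAD motif. A minimal sketch of that operation on a protein string is given below; the helper names and the toy peptide are hypothetical, and the sketch deliberately refuses to act when the motif is absent or occurs more than once rather than guessing which site is Motif II.

```python
def find_dead_motif(seq: str) -> list:
    """Return the 0-based start positions of every 'DEAD' occurrence."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "DEAD"]


def dead_to_dqad(seq: str) -> str:
    """Replace the Glu of a unique DEAD motif with Gln (DEAD -> DQAD).

    Raises ValueError unless exactly one DEAD motif is found, because a
    string match alone cannot tell which occurrence is the Walker B motif.
    """
    seq = seq.upper()
    sites = find_dead_motif(seq)
    if len(sites) != 1:
        raise ValueError(f"expected exactly one DEAD motif, found {len(sites)}")
    glu_index = sites[0] + 1          # second residue of D-E-A-D
    return seq[:glu_index] + "Q" + seq[glu_index + 1:]


# Hypothetical toy peptide, not a real Ddx4 fragment.
toy_peptide = "MKTAGVDEADRMLSG"

if __name__ == "__main__":
    print(find_dead_motif(toy_peptide))
    print(dead_to_dqad(toy_peptide))
```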

Ddx4-containing nuage formation and piRNA-mediated retrotransposon control: one size fits all

DDX4 is a master coordinator of nuage organization in the germline. RNP aggregates are commonly observed in the cytoplasm of germ cells throughout their life cycle, and were named “germ granules”, or “nuage”, because of their cloud-like, amorphous and membraneless morphology under transmission electron microscopy (TEM) [52,53,54] (Fig. 2). In the fetal or perinatal germline of male mice, these perinuclear organelles are vital for the biogenesis of piwi-interacting RNAs (piRNAs) derived from transposable elements (TEs). In general, following transcription by RNA polymerase II and nuclear export to the cytoplasm, two processing pathways have been proposed for piRNA maturation. The primary processing pathway cleaves the presumably single-stranded precursor piRNA transcripts into thousands of non-overlapping (phased) fragments with a strong 5ʹ uridine preference (Mili-bound sense primary piRNAs), while the secondary processing pathway amplifies the pool of Mili- and Miwi2-bound piRNAs through the so-called “Ping-pong” cycle [45, 51, 55, 56]. While many details remain elusive concerning which enzymes and cofactors are responsible for primary piRNA processing, it is quite clear that the secondary processing pathway requires the functional integrity of the Ddx4-enriched perinuclear nuage, in which a vast number of PIWI- and Tudor-family proteins reside and interact, as deduced from genetically engineered mouse models over the past decades [46, 54, 56, 57]. For instance, genetic deletion of Ddx4 or mutation of the catalytic residue in the DEAD motif altered the ultrastructural appearance and disrupted the function of the germ granules, as described below [21, 51, 54]. In Drosophila, a single amino acid substitution within the N-terminus (e.g., VasHE(R170S)) or in the helicase domain (e.g., VasAS(H520Y)) abrogated the structural integrity of the perinuclear granules [58].
While the helicase domain of Vasa/Ddx4 is absolutely essential for unwinding substrate RNA, the disordered tails alone, without the helicase domain, can maintain the structural architecture of granules, presumably through their interaction with other partners (Table 2) [58]. This genetic evidence suggests that full-length Vasa/Ddx4, with both tails and the helicase domain, is proficient in coordinating the organization and preserving the functional integrity of the germ granules in both mice and Drosophila. However, it is noteworthy that the secondary “Ping-pong” pathway differs slightly between mice and Drosophila in that different players and cofactors are involved, as detailed below.

Table 2 Representative Vasa/DDX4-interacting partners and their interacting domains

Pi-bodies and piP-bodies in fetal perinatal germline. In mouse fetal prospermatogonia, at least two types of germ granules related to piRNA biogenesis have been proposed on the basis of electron microscopy and immunofluorescence co-staining (Fig. 2B). The first type, called pi-bodies, comprises Mili/Tdrd1/Gasz/Mov10l1 and is responsible for processing the newly transcribed, TE-derived long precursor piRNA transcripts. The second type shares some components with the processing body (P body) and was thus named piP-bodies; these harbor Tdrd9/Miwi2/Mael in addition to universal P body members, including Ddx6 and Gw182 [54, 59,60,61,62]. The two types of granules are most often localized in close vicinity, so that processed secondary piRNAs can shuttle between them for the efficient production of mature piRNAs in a feed-forward way, a mechanism for piRNA biogenesis known as the “Ping-pong” cycle. Intriguingly, Ddx4, as a core piRNA biogenesis factor, is shared by both types of granules. At the molecular level, Mili binds the shorter piRNAs (~26 nt), while Miwi2 associates with the longer, TE-targeting antisense piRNA population (~29 nt) [56, 63]. Ddx4 co-localizes and associates with members of both types of nuage in the testis; however, it might not directly interact with piRNAs in the fetal germline, as non-crosslinked immunoprecipitation failed to recover the piRNA profile from fetal testis lysate. In contrast, its direct binding to pachytene piRNAs was detected in adult testes [50, 64]. Therefore, it is tempting to postulate that Ddx4 might enhance the shuttling of substrate RNAs between the pi-bodies and piP-bodies by remodeling their secondary structures, thereby facilitating the “Ping-pong” amplification-mediated biogenesis of TE-derived piRNAs [54, 62].
In agreement with this idea, genetic mouse models with deletion of these components remarkably phenocopied each other, as evidenced by the meiotic arrest and aberrant activation of TEs due to the defective de novo DNA methylation.
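Two of the sequence features mentioned in this section, the 5ʹ uridine preference of primary piRNAs and the different length modes of Mili-bound (~26 nt) and Miwi2-bound (~29 nt) piRNAs, are typically read off small-RNA sequencing data. The sketch below shows how such summary statistics could be computed from a list of reads; the function names and the four toy reads are hypothetical, and no real dataset or published pipeline is implied.

```python
from collections import Counter


def normalize(read: str) -> str:
    """Upper-case a read and write it in RNA letters (T -> U)."""
    return read.strip().upper().replace("T", "U")


def five_prime_u_fraction(reads: list) -> float:
    """Fraction of non-empty reads whose first nucleotide is uridine."""
    cleaned = [normalize(r) for r in reads if r.strip()]
    if not cleaned:
        return 0.0
    return sum(r[0] == "U" for r in cleaned) / len(cleaned)


def length_profile(reads: list) -> Counter:
    """Count reads per length (in nucleotides)."""
    return Counter(len(normalize(r)) for r in reads if r.strip())


# Hypothetical toy reads: two of 26 nt and two of 29 nt.
toy_reads = [
    "U" + "A" * 25,
    "T" + "G" * 25,   # DNA-style letter, counted as U after normalization
    "U" + "C" * 28,
    "A" + "C" * 28,
]

if __name__ == "__main__":
    print("5' U fraction:", five_prime_u_fraction(toy_reads))
    print("length profile:", dict(length_profile(toy_reads)))
```

In practice the length profile would be computed separately for each immunoprecipitated PIWI protein, since read length alone does not assign a piRNA to Mili or Miwi2.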

Intermitochondrial cement (IMC) in spermatocytes. In spermatocytes, the germ granules assemble into highly electron-dense particles with sizes ranging between 60 and 90 nm, or into clusters of 60–90 nm strands, but most often appear as a cementing material connecting neighboring mitochondria; they were therefore initially called “intermitochondrial cement (IMC)” by Eddy et al. [52, 53] (Fig. 2B). Interestingly, while different germline marker proteins, such as Mael and Ddx25, dynamically associate with the granules at varied stages, Ddx4 is most likely a constitutive component that always remains in and orchestrates the particles [60, 65]. The localization of these particles is highly dynamic within the cytoplasm; however, they are in close contact with the nuclear membrane and the mitochondria, implying that an ATP-dependent RNA-unfolding function, presumably provided by an RNA helicase such as Ddx4, is crucial for the functional integrity of the nuage [65]. Furthermore, the similar localization pattern and constituents of the granules in fetal spermatogonia and meiotic spermatocytes suggest that IMC is a continuation of the material inherited from the prospermatogonia. Notably, immuno-electron microscopy showed that IMC particles are even more abundant in the late stages of spermatocytes, i.e., pachytene spermatocytes. Since the nuage in the prospermatogonia, i.e., pi-bodies and piP-bodies, are the sites for processing precursor piRNAs from TEs via the “Ping-pong cycle”, and since the majority of pachytene piRNAs are derived from discrete piRNA clusters, it is highly likely that the major role of IMC in spermatocytes is to execute functions in silencing TEs similar to those in prospermatogonia.

Chromatoid body (CB) in round spermatids. Evolutionarily, IMC in the spermatocytes emanates from the pi-bodies present in the perinatal prospermatogonia as validated by immune-electron microscopy. Some IMC granules might depart from mitochondria, and aggregate further with the isolated clusters of strands into a single discernible electron-dense granular “dot”, giving rise to the chromatoid body (CB) in post-meiotic spermatids (Fig. 2B). Previous studies by immunofluorescence staining and immunoprecipitation have revealed abundant protein components in CB, including the members of piRNA pathway, such as Ddx4, Mael, PIWI family (Miwi, Mili) [57, 60, 66,67,68], Tudor family (Tdrd1, Tdrd6, Tdrd7) et al. [45, 69, 70]. CB also harbors protein constituents commonly related to RNA processing or metabolism, e.g., Upf1 and Upf2, which are involved in Non-sense-mediated mRNA Decay (NMD) (Table 2) [69]. The convergence of the Tudor-piRNA pathway and NMD pathway in the CB structure reinforces the functional significance of CB in piRNA and mRNA metabolism. However, unlike the piRNAs in the fetal gonocytes, of which the majority are transcribed from large discrete piRNA clusters from TE loci in the genome, ~ 80% of piRNAs in the pachytene spermatocytes and the spermatids stem from the top 50 unannotated intergenic genomic regions termed as “pachytene piRNA clusters”, whose functions are not yet clearly understood [71]. On the other hand, as discussed above, germ cell differentiation can proceed to round spermatids in the Ddx4+/KI testis, whereas meiotic arrest at the zygotene stage occurred in the Ddx4−/KI spermatocytes. The finding that the loss of repeat piRNAs and the presence of MILI-sliced 51-mer pre-piRNA intermediates in the Ddx4−/KI spermatocytes is reminiscent of the functional integrity of the pi-bodies where Ddx4 dwells in together with Mili-Tdrd1, rather than piP-bodies. 
In the CB of Ddx4+/KI round spermatids, the catalytic-dead Ddx4 remained complexed with WT Ddx4 and other well-known components (Miwi, Mili and Tdrd1). By comparison, immunoprecipitation revealed that those components, together with the 24–30-nt piRNAs, are more enriched, implying that catalytic-dead Ddx4 has lost the ability to dynamically remodel its RNA substrates in vivo and presumably entraps the piRNA machinery, leading to the destructive disengagement of its substrates [51, 72]. In addition, a short isoform of Tdrd6 is abundantly expressed and physically interacts with Ddx4 in round spermatids; its deletion led to a diffuse organization of the CB, with mislocalized Ddx4 and Miwi whose intense “singular dot” appearance was lost [50, 70]. The phenotypes of Tdrd6−/− and Ddx4+/KI testes resemble each other, both showing post-meiotic arrest at the round spermatid stage. Strikingly, the LINE1 and IAP retrotransposons are not significantly activated in either Tdrd6−/− or Ddx4+/KI testes, suggesting that Ddx4 and the CB have essential functions in haploid germ cell development that are independent of TE repression [51, 70, 72, 73]. Another predominant resident of the CB is Miwi, which frequently interacts with Ddx4 and whose deletion causes spermiogenic arrest at step 4 of round spermatid development (Table 2). The RNase (slicer) activity of Miwi, conferred by the consensus “DDH” motif, is critical for silencing LINE1 transposons independently of the “Ping-pong cycle” [74]. The CBs in round spermatids from either Miwi-null or Miwi slicer-deficient (Miwi+/ADH) testes were fragmented, and the LINE1 protein L1ORF1p was dramatically upregulated. However, the localization of most CB constituents appeared unaffected. 
Together, this evidence suggests that the Ddx4 abundantly residing in the CB might coordinate and remodel multiple players for both transposon repression and the piRNA-independent RNA restructuring essential for CB-orchestrated haploid spermatid development [56, 74].

Vasa ablation caused nuage loss and derepression of retrotransposons in Drosophila. In Drosophila, the post-meiotic pro-oocyte develops into the oocyte with progressive loss of the perinuclear germ granules, whereas the other 15 supporting cells (nurse cells) retain the nuage within the egg chamber (Fig. 2A). This process is accompanied by condensation of the nuclear karyosome and shutdown of the transcriptional machinery [1, 75, 76]. Thus, in contrast to mammals, the nuage in female Drosophila is present in the nurse cells, but not in the oocyte or the soma (i.e., follicle cells) (Fig. 2A). Lacking a functional nuage machinery, the follicle cells harbor only the primary piRNA processing pathway, which takes place in another specialized subcellular compartment, the Yb body [77]. Unlike the majority of pachytene piRNAs, which are generated from large single-stranded piRNA clusters in mice, Drosophila piRNAs are produced from either uni-strand or dual-strand piRNA clusters, such as the 42AB cluster locus in the pericentromeric region of chromosome 2R. The piRNA precursors, once transcribed by RNA Polymerase II, are bound by UAP56, which localizes close to the inner side of the nuclear pores [78]. Meanwhile, Vasa, which is enriched in the perinuclear nuage, dynamically interacts with UAP56 across the nuclear pores, facilitating the transport of piRNA precursors from the nucleus to the cytosol for subsequent processing into mature piRNAs. Depletion of Vasa elicited the disassembly and loss of perinuclear granules and abolished transposon silencing, but did not affect the expression of protein-coding genes, suggesting a central role for Vasa in nuage-guided transposon control.

Another interesting finding in Drosophila is that nuage appears to be tightly coupled to mitochondria during primary piRNA biogenesis. This idea is supported by the colocalization of multiple effector proteins, such as Zuc, Armi and Papi, on the surface of mitochondria [79]. Consistently, their homologs in mice, including MitoPLD, Mov10l1 and Tdrkh, are all present on the outer membrane surface of the mitochondria as well as in the IMC in the mouse germline [59, 62, 80]. However, electron microscopy showed that the perinuclear nuage is not physically close to mitochondria. Intriguingly, genetic deletion of these genes caused similar defects in the primary piRNA processing pathway in both Drosophila and mice, implying that the mitochondria are indeed pivotal for primary piRNA processing. The next question is why the mitochondria are so crucial for piRNA biogenesis. Does the mitochondrial membrane surface provide an anchoring site, thereby increasing the local concentration of the machinery for the assembly of primary piRNA effector proteins? This hypothesis seems unlikely, since other similar membrane structures, e.g., the endoplasmic reticulum, are also present in each cell. As detailed above, efficient piRNA processing and movement involve rapid RNP remodeling of the nuage and thus consume large amounts of ATP, as required by the ATPases Vasa/Ddx4 and Armi/Mov10l1 [72]. In this sense, it seems plausible that the nuage retains close physical contact with the mitochondria because they are the powerhouse of the cell.

Hypothesis: Vasa/Ddx4 fine-tunes the target RNA/protein repertoire in germ granules independent of its sequence specificity

The defining feature of RNA helicases is their ability to unwind RNA duplexes. RNA helicases can be categorized into two types based on their catalytic mechanism. The first type, the processive RNA helicases such as Mov10l1 and Staufen, unwinds duplex RNA in a canonical ATP-dependent manner by binding and translocating in a 5’-3’ direction along the RNA strand [43, 59, 81]. The second type, which includes the DEAD-box protein family, relies on ATP hydrolysis to resolve or melt local RNA secondary structure in a non-processive way, facilitating strand separation or the dissociation of RNA/RNA or RNA/protein complexes. As mentioned above, and as for other DEAD-box proteins, the crystal structure revealed that the helicase core of Vasa, which is tightly folded from the NTD and CTD, interacts with the sugar-phosphate backbone rather than the base moieties. This implies that the RecA-like domains do not bind target RNAs in a sequence-specific manner, but exert their helicase function independently of RNA sequence to resolve short local RNA duplexes non-processively [33, 72]. In support of this hypothesis, among 221 candidate Vasa-bound mRNAs identified by Liu et al. in Drosophila embryos, only 24 were present along with Vasa in pole cells, and no consensus sequence motif appeared to be shared by those mRNAs [82]. Furthermore, the disordered C. elegans protein MEG-3 preferentially binds ~500 mRNAs with low ribosome coverage in a sequence-independent manner within the P granules in vivo, as revealed by individual-nucleotide resolution UV crosslinking and immunoprecipitation (iCLIP) [83]. Indeed, the RNP constituents are highly condensed in the coacervate; therefore, there is a peak demand for ATP-dependent unwinding of RNA duplexes to facilitate timely communication and interaction among RNAs and between RNAs and proteins. 
In this regard, the DEAD-box helicases, including Vasa/Ddx4, appear well suited to act as global regulators of the germ granules formed through phase separation [30, 35].

At the protein level, Vasa/Ddx4 has been shown to interact physically with other components within germ granules, including Oskar and eIF5B in Drosophila [44, 82], and Ranbp9, Mili, Miwi and others in mice [50, 56, 68, 84] (Table 2). In most cases, this interaction is likely mediated through the IDRs of Vasa/Ddx4, or occasionally through the full-length protein. Since the IDRs exhibit no stable tertiary structure, it is tempting to posit that Vasa/Ddx4 has a great capacity to facilitate the unwinding and processing of its diverse RNA substrates through repeated, weak, and transient interactions. Therefore, building a proteomic landscape of the molecular players in the granules will greatly assist in understanding the mechanisms underlying germline development. Two features characterize the protein architecture of the proteinaceous RNP granules. One is the presence of RNA-binding domains, such as the RNA recognition motif (RRM), the K-homology (KH) domain, the double-stranded RNA-binding domain (dsRBD), and others. The other is the enrichment of low-complexity sequences capable of driving phase transition. This capability endows the RNA-binding protein with the ability to enrich its RNA targets specifically in the RNP granules [14, 18]. In agreement with this hypothesis, high-throughput sequencing revealed, for example, that mRNAs with the Pumilio-bound Nanos response elements are specifically retained in the Pumilio-containing RNP granules [85]. The core members of the NMD pathway, including the ATPase Upf1 and Upf2, are remarkably concentrated in the CB of round spermatids, and a body of evidence shows that the CB is an RNA-processing center that enables highly specialized mRNA decay, or selective degradation of transcripts with longer 3’UTRs, restricted to a relatively isolated microreactor organelle [67, 86, 87]. It is thus conceivable that Ddx4, which is highly condensed in the CB, efficiently handles the timely unwinding and sorting of substrate RNA molecules.

Post-translational regulation of Ddx4 protein

Germline proteins are frequently subject to a combination of post-translational modifications in vivo and, intriguingly, these modifications in turn fine-tune Ddx4 function by modulating the intrinsic protein–protein or protein-RNA interaction networks throughout germline development. As mentioned above, there are at least six putative consensus arginine methylation sites (RGG) at the N-terminus of the endogenous Ddx4 protein in mouse testes, some of which were validated by immunoprecipitation and mass spectrometry analysis [64]. However, upon treatment with the methyltransferase inhibitor MTA, there are no apparent changes in the binding intensity between Ddx4 and Tdrd1 or Tdrd6, indicating that these interactions are unlikely to be mediated through the Tudor-Rme2 module commonly observed in Tudor-PIWI interactions [64]. Rather, the methyl marks attenuate the electrostatic interactions, which play a critical role in the assembly of RNP granules, as detailed above. In addition, Ddx4 is acetylated at Lys405, which lies in the DEAD-box domain, by the acetyltransferase Hat1 and its cofactor p46 in mouse testis [88]. The acetylation of Ddx4 is developmentally regulated and parallels the dynamic colocalization of Hat1 and p46 in the CB. Once Ddx4 is acetylated, its bound target eIF4B mRNAs are selectively released from the Ddx4-containing RNP complex, followed by elevated translation of eIF4B protein, suggesting a key role for Ddx4 in regulating germline-specific protein translation/activation [88].

Perspective

The establishment of a germline lineage is of paramount importance in sexually reproducing metazoan species. Currently, at least two general mechanisms are known to specify the germline in the animal kingdom: (i) preformation, as seen in worms and Drosophila, wherein germ granules are maternally partitioned and loaded to specify the PGCs; and (ii) induction, also termed epigenesis, as seen in mice, wherein extrinsic signaling factors, e.g., BMP4, induce cell–cell interactions that stimulate differentiation into PGCs [1]. Irrespective of the mechanism employed for PGC specification, germ cells host a myriad of highly conserved molecular determinants, such as Dazl, Vasa and Gcna. In mice, Dazl is activated much earlier than Ddx4 in PGCs, and genome-wide CLIP (crosslinking and immunoprecipitation) profiling revealed that Dazl preferentially recognizes and binds mRNA 3ʹUTRs containing a UGUU motif, thereby promoting their translation [89]. Gcna is an ancient germline marker that also harbors an IDR at its N-terminus, and its deletion caused male sterility resembling the Ddx4 knockout phenotype [90]. Since Ddx4 is a constitutive component of the germ granules, Kotaja’s laboratory performed crosslinking combined with immunoprecipitation using a Ddx4-specific antibody and identified a multitude of protein constituents of the CB, many of which have proven to be required for male germline development [66, 67, 84, 86, 87, 91]. They also verified that abundant piRNAs, mRNAs and uncharacterized long non-coding RNAs are present in the CB [66, 67]. Nevertheless, since the nuage structures frequently change their morphology and their protein/RNA residents, it is worthwhile to track the dynamic compositions that accommodate the specific needs of germline development [6, 83]. 
Moreover, cells captured with a Ddx4 antibody were shown not to be bona fide germ cells when ovary samples were immunolabeled for FACS [92,93,94,95]. Therefore, it is necessary to exploit an antibody against a common tag, such as HA or Flag, to specifically label the Ddx4 protein in situ, so that the morphologically highly heterogeneous population of nuage can be purified from various stages of germline development. In contrast to Dazl, which hosts the well-known RNA-binding domain (RRM) along with the protein–protein interaction domain (DAZ), Ddx4 harbors only the DEAD-box helicase domain, flanked by unstructured IDPR tails at both ends. We can thus envision that Ddx4 might not bind specific target RNAs directly. Rather, it works as a core coordinator that, through its inherent helicase activity, facilitates the timely unfolding of duplex RNAs and proper interactions among the RNA/protein constituents within the over-crowded milieu of the germ granules [82]. Further deciphering the dynamic interaction landscape of the RNA and protein components within the germ granules throughout germline development will shed light on the mechanisms underlying germline specification and maintenance.