INTRODUCTION

For over 100 years, the fruit fly Drosophila melanogaster has served as a universal model in genetic studies. During this time, a series of significant discoveries were made in Drosophila concerning gene structure, genetic linkage, mechanisms of mutagenesis and recombination, genetic instability, and microevolutionary processes in populations. Drosophila also made possible some of the most important fundamental discoveries in developmental biology: the basic conserved genetic mechanisms governing the stages of individual development were deciphered in this model.

For a long time, the search for genes that control development relied on the classical approach: induction of mutations by chemical or radiation mutagenesis, analysis of the mutant phenotype by hybridological methods, and thorough genetic mapping of the genes (Riggleman et al., 1989). Approximately 20 years ago, the genome of D. melanogaster was completely sequenced and annotated. This made it possible to use the reverse genetics strategy in the genetic analysis of Drosophila development. One of the main approaches to studying gene function by reverse genetics is directed inactivation of the gene with subsequent study of the mutant phenotype. In the 2000s, a Drosophila gene inactivation system based on homologous recombination was proposed (Rong et al., 2000). Later, methods for gene inactivation using transposon mutagenesis systems were improved (Nagarkar-Jaiswal et al., 2015), and the CRISPR-Cas technique for inactivating and editing genes was introduced (Ewen-Campen et al., 2017). The complete knockout of genes that control ontogenesis is often accompanied by a lethal phenotype, which complicates the study of the functions of such genes. The problem can be overcome by systems that inactivate genes in specific tissues or even in specific individual cells (Theodosiou et al., 1998; Lee and Luo, 2001; Ryder and Russell, 2003). Systems have also been developed to study tissue- and age-specific expression of individual genes (McGuire et al., 2004).

Currently, D. melanogaster is one of the most studied species of living organisms. Owing to a huge arsenal of methods that make it quite easy to manipulate its genome, Drosophila is one of the most powerful biological models. This review considers the main modern methods of studying the expression and function of genes in Drosophila.

TRANSPOSON MUTAGENESIS

In the 1980s, Rubin and Spradling developed a transposon mutagenesis technique for Drosophila using the P element (Rubin and Spradling, 1982). This technique made it possible to obtain mutants carrying transposon insertions at random locations in the genome, including within genes. The development of the technique was facilitated by the discovery of a phenomenon called hybrid (gonadal) dysgenesis. Hybrid dysgenesis is manifested in offspring as an increased frequency of transposition of mobile elements, which is accompanied by gene and chromosomal mutations, recombination in males, and sterility of hybrids (Kidwell, 1985). Hybrid dysgenesis has been described not only for the P element but also for some other transposons and retrotransposons. Crosses can be dysgenic only if the males carry a transpositionally active mobile element but the females do not. This phenomenon is explained today by the fact that females with copies of a certain mobile element in the genome acquire protective mechanisms based on piRNA interference and suppress its transposition in the ovarian tissues (Duc et al., 2019).

In the experiments of Rubin and Spradling, a construct containing the P element was built on the basis of a plasmid vector. The 5'- and 3'-terminal repeats necessary for transposase recognition were retained, whereas the central part (including the transposase gene) was replaced with the rosy+ gene. This construct was injected into early Drosophila embryos together with a helper plasmid expressing transposase or carrying a full-sized P element (Fig. 1). Embryos mutant in the rosy gene, whose genomes had no P element, were used for injection. In this experiment, 8% of the injected embryos developed into fertile imagoes, and 39% of these produced offspring with the rosy+ phenotype, i.e., offspring that carried a transposon insertion in the genome. Next, mutants were selected according to the phenotype of interest, and the site of the transposon insertion was determined. Thus, an effective method for studying the function of Drosophila genes by selecting mutants after nontargeted transposon mutagenesis was developed.
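
The two efficiencies reported above multiply to give the overall yield per injected embryo. A minimal arithmetic sketch in Python; the function name and the batch of 1000 embryos are illustrative assumptions, not part of the original experiment:

```python
def expected_transformant_lines(n_injected, frac_fertile=0.08, frac_transmitting=0.39):
    """Expected number of injected embryos that end up giving rosy+ offspring.

    frac_fertile: share of injected embryos that become fertile adults (8% in the text).
    frac_transmitting: share of those adults with rosy+ offspring (39% in the text).
    """
    return n_injected * frac_fertile * frac_transmitting


# Overall yield per injected embryo and for a hypothetical batch of 1000 embryos.
overall_rate = expected_transformant_lines(1)
lines_per_1000 = expected_transformant_lines(1000)
print(f"overall yield per injected embryo: {overall_rate:.2%}")
print(f"expected transformant lines per 1000 embryos: {lines_per_1000:.1f}")
```

In other words, roughly 3 of every 100 injected embryos would be expected to found a transgenic line under these figures.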

Fig. 1.

Transposon mutagenesis using the P element. The method is based on coinjecting early embryos of the white genotype with a construct carrying the rosy+ gene flanked by the ends of the P element and with the P transposase gene. After crossing, the chimeric individuals obtained by injection give, with some probability, offspring with pink eyes (the transgenic offspring is outlined).

This method was further developed, and the task during the implementation of the Berkeley Drosophila Genome Project (BDGP) was to inactivate each Drosophila gene by introducing a P element. As part of this task, more than 30 000 strains of flies carrying a transposon in different parts of the genome were obtained. More than 6000 strains were selected to replenish the collection of Drosophila strains from the Bloomington Stock Center. In total, approximately 40% of Drosophila genes to date contain P element insertions in the coding or regulatory part of the gene (Bellen et al., 2004).

Similar transposon mutagenesis systems were later developed based on the Minos transposon of D. hydei (Loukeris et al., 1995) and the piggyBac transposon of lepidopterans (Lobo et al., 1999). Neither transposon is found in the D. melanogaster genome, which means that any D. melanogaster strain used for transgenesis will be a priori dysgenic for them.

MUTAGENESIS USING THE GAL4/UAS SYSTEM: MANAGING GENE EXPRESSION

The method of transposon mutagenesis has become widespread and has been used not only to inactivate genes but also to control the expression of cloned genes. Transposon mutagenesis using the GAL4/UAS system was first used by Brand and Perrimon (1993) to study the function of the even-skipped gene, which is involved in the control of segmentation in Drosophila. The system consists of two constructs: one contains the gene encoding the yeast transcriptional activator GAL4, and the other contains the studied gene, into the 5'-regulatory region of which the GAL4 binding site, the UAS enhancer (CGG-N11-CCG), is introduced. The constructs are carried by two different transgenic strains of flies. One strain, the driver, expresses GAL4 under the control of a genomic enhancer; the other strain contains the studied gene under the regulation of UAS (Fig. 2). When these two strains are crossed, GAL4 produced in the hybrids activates the studied gene. Activation can occur in all cells of the body (for example, on exposure to elevated temperature if GAL4 expression is controlled by a heat shock gene promoter) or can be tissue-specific (if GAL4 is expressed in a certain cell type or tissue under a tissue-specific promoter). It was shown that the product of the yeast gene GAL4 does not significantly affect the phenotype of flies. To date, a collection of strains that express GAL4 in different tissues has been obtained; these strains are called GAL4 strains. These include GMR-GAL4 (expression in postmitotic cells of the eye), CG7077-GAL4 (expression in pigment cells), sNPF-GAL4 (expression in cells of the central nervous system), elav-GAL4 (expression in brain neurons), e22c-GAL4 (expression in follicular stem cells), etc. (for a more complete list, see http://flystocks. bio.indiana, http://flybase.org/).
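
The consensus GAL4 binding site quoted above (CGG-N11-CCG) translates directly into a pattern search. The following is a minimal sketch, assuming a plain DNA string as input; the function name and the test sequence are hypothetical and do not represent a real UAS element:

```python
import re

# Consensus GAL4 binding site as given in the text: CGG, any 11 bases, CCG.
# The pattern sits inside a lookahead so that overlapping sites are also reported.
UAS_SITE = re.compile(r"(?=(CGG[ACGT]{11}CCG))")


def find_gal4_sites(seq):
    """Return (0-based start, matched 17-mer) for every consensus site in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in UAS_SITE.finditer(seq)]


# Hypothetical test sequence: 3 bases, one consensus site, 4 trailing bases.
example = "TTACGGAGTACTGTCCTCCGAGTT"
sites = find_gal4_sites(example)
print(sites)
```

Because the reverse complement of CGG-N11-CCG has the same form, scanning one strand is enough to find consensus sites on both.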

Fig. 2.

Method for controlling gene expression using the GAL4/UAS system. When two strains are crossed, one carrying the studied gene X under the control of the yeast UAS enhancer and the other carrying GAL4 under the control of a genomic enhancer, transcription of the studied gene is activated in the hybrids. Instead of the gene X sequence, a construct that suppresses gene X expression by RNA interference can be placed under UAS control.

The use of the GAL4/UAS system opens up wide prospects for controlling gene expression, since it allows one to selectively (in a cell- or tissue-specific manner) activate or suppress transcription of the studied gene. The latter is possible when using a helper strain expressing the GAL4 repressor, GAL80.

MUTAGENESIS USING SITE-SPECIFIC RECOMBINASES: GENE AND PROTEIN TRAPS

A breakthrough stage in the reverse genetics of Drosophila was the development of a method for site-specific integration of a given sequence into the genome. This technique was originally intended for the mouse genome (Branda and Dymecki, 2004). For Drosophila, the systems Flp-FRT (Golic and Golic, 1996), PhiC31 (Groth et al., 2004), and Cre-Lox (Nakazawa et al., 2012) were adapted.

The Cre-Lox recombination system of bacteriophage P1 consists of the Cre recombinase enzyme, which recognizes two short target sequences, LoxP, and recombines between them. The Flp-FRT recombination system from the 2-micron Saccharomyces cerevisiae yeast plasmid is similar to Cre-Lox, and includes Flp recombinase (flippase), which performs recombination between target sites, FRT. Based on these two systems, transgenic Drosophila strains carrying the recombinase gene and recognition sites were obtained (Fig. 3).

Fig. 3.

One possible approach to controlling gene expression using the Flp/FRT recombination system. In this example, the Flp flippase is controlled by the heat shock gene promoter. Induction of Flp expression leads to recombination at the FRT sites, which activates the expression of the GAL4 gene, which is controlled by the actin gene promoter. The GAL4 product, in turn, triggers the expression of the fluorescent protein gene and/or the studied gene X.
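
The logic of the scheme in Fig. 3 can be sketched as a toy model in which the construct is a list of named elements. The model assumes that a transcriptional stop cassette sits between the two FRT sites and blocks GAL4 until Flp excises it; this cassette, the element names, and both functions are illustrative assumptions rather than details given in the caption:

```python
def flp_excise(construct):
    """Remove everything between the first two FRT sites, leaving a single FRT."""
    sites = [i for i, element in enumerate(construct) if element == "FRT"]
    if len(sites) < 2:
        # Recombination needs two target sites; otherwise nothing changes.
        return list(construct)
    first, second = sites[0], sites[1]
    return construct[:first + 1] + construct[second + 1:]


def gal4_expressed(construct):
    """GAL4 is transcribed only if no STOP lies between the promoter and GAL4."""
    start = construct.index("actin_promoter")
    end = construct.index("GAL4")
    return "STOP" not in construct[start:end]


# Hypothetical construct before heat shock (assumed stop cassette between FRT sites).
before = ["actin_promoter", "FRT", "STOP", "FRT", "GAL4"]
after = flp_excise(before)

print(before, "-> GAL4 on:", gal4_expressed(before))
print(after, "-> GAL4 on:", gal4_expressed(after))
```

Once the cassette is excised the change is heritable within the cell lineage, which is why a brief pulse of Flp expression is sufficient to switch GAL4 on permanently in the affected cells and their descendants.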

A comparative analysis of the efficiency of gene knockout using Flp and Cre recombinases in D. melanogaster was performed by Frickenhaus et al. (2015). The authors used the GAL4/UAS-Flp and GAL4/UAS-Cre systems for specific expression of the corresponding recombinases in neurons and muscles in order to inactivate the cabeza gene. They concluded that, as a knockout tool, Flp recombinase is more effective than Cre recombinase, which they attributed to insufficient expression of Cre in the studied cells. In addition, the authors found that the Cre protein is toxic to Drosophila, which is not observed with the Flp protein.

The Flp-FRT recombination system was used to obtain chromosomal rearrangements in Drosophila. The efforts of the consortium projects DrosDel (Bloomington Drosophila Stock Center) and Exelixis were aimed at obtaining deletion mutations in several thousand genes. To obtain rearrangements, a collection of strains carrying insertions of FRT sites was used, between which mass crosses were performed. Thus, during the implementation of the DrosDel project, a library of deletion mutations was obtained that together cover approximately 80% of the genome (Ryder et al., 2007). In total, during the implementation of the DrosDel and Exelixis projects, more than 500 000 deletions were obtained, ranging in size from 1 bp to 1 million bp.

The phiC31 phage recombination system has proven to be a particularly useful tool for obtaining strains of transgenic flies, since it is the most efficient for inserting different transgenic sequences into the same site in the genome (Groth et al., 2004; Bischof et al., 2007). PhiC31 encodes an integrase that provides recombination between attP and attB sites. Upon recombination between the attP and attB sites, hybrid attL and attR sites are formed that are not recognized by the enzyme. To date, a set of strains containing integrase landing sites throughout the genome has been obtained and is available at various collection centers (Knapp et al., 2015). Recently, a mutant phiC31 integrase has been obtained that is capable of not only integrating but also excising (cutting) at recognition sites, which is useful for obtaining combinations of various transgenes within a single gene.
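
The one-way character of phiC31 integration can be illustrated with a schematic model in which each att site is reduced to a pair of arm labels (attB = B/B', attP = P/P'). The arm notation, the class, and the function are illustrative assumptions; no real site sequences are used, and only the wild-type integrase is modeled:

```python
from typing import NamedTuple


class AttSite(NamedTuple):
    left: str   # label of the left arm of the site
    right: str  # label of the right arm of the site


ATT_B = AttSite("B", "B'")
ATT_P = AttSite("P", "P'")


def integrate(site_1, site_2):
    """Exchange arms between attB and attP, returning the hybrid (attL, attR)."""
    if {site_1, site_2} != {ATT_B, ATT_P}:
        raise ValueError("wild-type integrase acts only on an attB x attP pair")
    att_l = AttSite(ATT_B.left, ATT_P.right)   # B / P'
    att_r = AttSite(ATT_P.left, ATT_B.right)   # P / B'
    return att_l, att_r


att_l, att_r = integrate(ATT_B, ATT_P)
print("attL:", att_l, "attR:", att_r)

# The hybrid sites are not substrates, so the reverse reaction is rejected.
try:
    integrate(att_l, att_r)
except ValueError as err:
    print("no reverse reaction:", err)
```

This is why a transgene integrated at an attP landing site stays in place: the products of the reaction no longer match what the enzyme recognizes.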

Using phiC31 recombination, a recombinase-mediated cassette exchange (RMCE) method was developed (Bateman et al., 2006). Using this approach, the genomic landing site containing the marker gene flanked by attP sites can be replaced by any other DNA sequence through a plasmid containing the gene of interest flanked by attB sites (Fig. 4a). It is important to note that this technology allows integrating even unmarked constructs into the Drosophila genome, i.e., even those that lack functional genes and contain regulatory or utility sequences (e.g., multiple cloning sites).

Fig. 4.

Methods for controlling gene expression using the PhiC31 recombination system. (a) Gene X can be integrated into a specific genomic site into which PhiC31 recognition sites, attP, have been preintegrated. To start the recombination process, a system of crosses is used between fly strains carrying the PhiC31 recombinase gene and its recognition sites, together with a transgene donor plasmid. (b) Scheme of substitution of the mini-white gene with the yellow gene using a donor plasmid (the RMCE method). (c) MiMIC system. The construct consists of two inverted repeats of the Minos transposon (L and R), two inverted attP PhiC31 (P) sites, a gene trap cassette consisting of an acceptor splice site (SA) followed by stop codons in three reading frames (red circle), the GFP gene with a polyadenylation signal (pA), and the yellow+ gene. The sequence between attP sites can be replaced via RMCE, resulting in two attR hybrid sites being formed. A donor plasmid for RMCE can be a plasmid consisting of a polylinker site for cloning, a plasmid with an effector gene (for example, GAL4) fused to SA, or a plasmid-protein trap consisting of a reporter (for example, GFP) flanked by SA and a donor splice site (SD).

One of the most useful and flexible strategies based on transposons and the RMCE system is the MiMIC (Minos-mediated integration cassette) system (Venken et al., 2011). The MiMIC construct is a Minos transposon that carries two inverted phiC31 recombinase recognition sites, attP; between them lie a gene trap, consisting of a splice acceptor site followed by stop codons in three reading frames and the GFP (green fluorescent protein) gene, and the yellow+ selective marker. The attP sites allow replacing the internal transposon sequence with any other sequence via RMCE (Fig. 4b). Insertion of the MiMIC construct in the correct orientation into an intron of a coding gene will lead to translation of a truncated protein due to the presence of the splice acceptor and stop codons; thus, the insertion will act as a gene trap. The uniqueness of the MiMIC system is the ability to introduce regulatory genes, such as GAL4 or Flp, and functional reporters, such as GFP, into the sequences of the studied gene (Fig. 4c).

In the paper by Venken et al. (2011), a collection of more than 6000 insertions of MiMIC into regulatory sequences and gene introns was obtained. Approximately 2000 genes currently have MiMIC inserts in introns, but the use of CRISPR technology (see below) to introduce MiMIC inserts into the genome is expected to greatly expand the capabilities of the method.

The protein-trapping method is based on the use of MiMIC constructs that carry a fluorescent protein sequence flanked by SA and SD. If such a construct is inserted into an intron, the reporter (usually the GFP gene) falls into the same reading frame as the "captured" gene (Fig. 4c). This approach has been successfully used in a number of model organisms, including Drosophila, for which collections of fly strains expressing GFP as part of the MiMIC construct inserted into the introns of different genes have been created.

GFP traps are mainly used to study the expression patterns of captured genes or the cellular localization of their protein products. The GFP trap can also be used to suppress, through RNA interference, the expression of a gene fused in frame with GFP. This method, called tag-mediated loss-of-function, eliminates the main disadvantages of the classic RNA interference knockdown approach, in which gene-specific sequences are targets for small RNAs. In the work by Neumüller et al. (2012), the maternal effect of several genes (Spt6, Cp1, Pabp2, and par-6) in embryogenesis was studied by using this method to switch off their expression specifically in germ line cells.

RNA INTERFERENCE: GENE KNOCKDOWN

RNA interference (RNAi) is an endogenous cellular mechanism triggered by double-stranded RNA (dsRNA), which leads to degradation of homologous RNA and suppression of gene expression at the posttranscriptional level (Ameres and Zamore, 2013). The RNAi mechanism was first discovered in Caenorhabditis elegans but was then found in the cells of many eukaryotes: in animals, plants, and fungi.

Table 1. Development processes modeled on Drosophila using new genetic technologies

A detailed study of the mechanisms of RNAi made it possible to develop a number of approaches that use RNAi for targeted inactivation of gene expression: gene knockdown. RNAi as a mechanism of suppressing gene expression in Drosophila was first used by direct injection of dsRNA into early embryos to study the role of the Frizzled and Frizzled2 genes during early embryo development (Kennerdell and Carthew, 1998). Later, fly strain collections were obtained expressing short dsRNA hairpins (shRNAs) complementary to specific genes. dsRNA hairpins are expressed under the control of the GAL4/UAS system, allowing targeted suppression of gene expression in hybrids. The collection of transgenic knockdown strains currently covers approximately 12 000 genes, which makes up more than 80% of all known protein-coding genes in Drosophila. Collections are available at the Harvard Drosophila RNAi Screening Center (DRSC) (Ramadan et al., 2007) and the Vienna Drosophila RNAi Center (VDRC) (Dietzl et al., 2007).

RNAi leads to directed degradation of a specific mRNA in the cytoplasm, a process that typically leads to a decrease in gene expression but not to its complete absence. A recent analysis of the effectiveness of gene knockdown by RNAi showed that 90% of in vivo strains show residual gene expression (25% or more) (Perkins et al., 2015). Therefore, RNAi usually leads to a hypomorphic phenotype in which the amount of product encoded by the gene is significantly reduced but not completely absent. This can be an advantage, for example, for studying vital genes whose complete shutdown is lethal for the organism. However, in some cases, a hypomorphic phenotype interferes with the study, for example, if the gene is normally expressed at a low or very low level. It is especially difficult to control gene expression during individual development, when expression is highly dependent on age.

RNAi usually leads to knockdown with approximately the same efficiency in all GAL4-expressing cells, although mosaic effects may be observed in some cases (Bosch et al., 2016). The effectiveness of RNAi is limited by the concentration of small RNA molecules. Thus, the effect of RNAi is not stable and ceases after the cessation of dsRNA synthesis. Side effects of RNAi can occur when the introduced RNA molecule has a sequence that is complementary to several genes at the same time, which leads to a decrease in the expression of several genes at once. Currently, a number of computer programs have been developed to select interfering RNAs with a high degree of reliability. Good results are obtained using the tag-mediated loss of gene function technique, when the studied gene is fused in the same translation frame as the GFP gene, and small interfering RNAs against GFP are used for RNAi (Neumüller et al., 2012).
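
The off-target problem described above can be approximated as a search for short stretches of identity between the RNAi trigger and unintended transcripts. The sketch below is a toy version of what such selection programs check, not a reimplementation of any of them; the default window of 19 nt, the gene names, and the sequences are all illustrative assumptions, and every sequence is written as the sense strand:

```python
def kmers(seq, k):
    """Set of all substrings of length k in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def off_target_genes(trigger, transcripts, intended, k=19):
    """Names of transcripts, other than the intended one, sharing a k-mer with trigger."""
    trigger_kmers = kmers(trigger.upper(), k)
    return sorted(
        name
        for name, seq in transcripts.items()
        if name != intended and trigger_kmers & kmers(seq.upper(), k)
    )


# Hypothetical transcripts; a short window (k=8) keeps the toy sequences readable.
transcripts = {
    "geneA": "ATGGCGTACGTTAGCCTAGGATCC",
    "geneB": "TTTTGCGTACGTTAGAAAA",
    "geneC": "ATGCCCCCCCCCCTTTTTTTAA",
}
trigger = "GGCGTACGTTAGCC"  # taken from geneA, the intended target

hits = off_target_genes(trigger, transcripts, intended="geneA", k=8)
print(hits)
```

A trigger that returns an empty list under such a check is a better candidate, although real tools also weigh mismatches and expression data that this sketch ignores.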

SITE-DIRECTED MUTAGENESIS AND GENE EDITING

An important step in reverse genetics was the development of a method for targeted inactivation of genes using the bacterial system CRISPR/Cas9. RNAs transcribed from the CRISPR locus (crRNA) form a complex with trans-activating CRISPR RNA (tracrRNA) and the Cas9 nuclease. The complex binds to complementary DNA, which is then cleaved by the nuclease. For experiments using the CRISPR/Cas9 system, the two small RNAs are combined into one, which is called guide RNA (gRNA).
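
Choosing a guide RNA (gRNA) begins with listing candidate target sites in the gene. The sketch below assumes the commonly used Streptococcus pyogenes Cas9, which requires an NGG motif (the PAM) immediately 3' of a 20-nt target; neither the PAM requirement nor the 20-nt length is stated in the text above, and the function name and test sequence are hypothetical. Only the given strand is scanned:

```python
import re

# 20-nt target followed by an NGG PAM (assumption: S. pyogenes Cas9).
# The lookahead allows overlapping candidate sites to be reported.
TARGET = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


def find_target_sites(seq):
    """Return (0-based start, 20-nt target, PAM) for each candidate on this strand."""
    seq = seq.upper()
    return [(m.start(), m.group(1), m.group(2)) for m in TARGET.finditer(seq)]


# Hypothetical sequence: one 20-nt target, an AGG PAM, and three trailing bases.
example = "GATTACAGATTACAGATTACAGGTTT"
candidates = find_target_sites(example)
print(candidates)
```

A full design tool would also scan the reverse complement and screen each candidate against the rest of the genome for near matches.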

Transgenic Drosophila strains expressing Cas9 under the control of the promoters of the nanos (nos) or vasa genes (Fig. 5) have been obtained recently. These strains were used as recipients for the injection of plasmids expressing gRNA under the promoter of a small nuclear RNA U6, which significantly increased the efficiency of the method (Kondo and Ueda, 2013; Port et al., 2014).

Fig. 5.

Gene knockout method using the CRISPR/Cas9 system. The system is activated in hybrids obtained by crossing females expressing Cas9 under the nos gene promoter with males expressing guide RNA under the U6 promoter. After crossing with the wild type, the resulting hybrids can produce offspring in which the studied gene, for example, the white gene, is switched off (mutant descendants are outlined).

The easiest way to modify a gene with the CRISPR/Cas9 technology is to introduce short insertions/deletions (indels) by stimulating nonhomologous end joining of the DNA, which often leads to frameshift mutations and, consequently, gene shutdown or synthesis of a truncated protein. Since indel size is random, a significant number of cells will contain mutations that do not impair gene function. As a result, the obtained individuals are, as a rule, genetic mosaics consisting of cells with two, one, or no functional copy of the targeted gene (Port et al., 2014).
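
Whether an indel disrupts the reading frame depends only on its length: a change that is a multiple of 3 bp adds or removes whole codons and may leave the protein functional. A minimal sketch, with a hypothetical list of indel sizes standing in for sequencing results:

```python
def is_frameshift(indel_length):
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


# Hypothetical indel sizes in bp (negative = deletion, positive = insertion).
indels = [-1, -3, 2, -7, 6, -12, 1]

frameshifts = [n for n in indels if is_frameshift(n)]
in_frame = [n for n in indels if not is_frameshift(n)]

print("frameshift indels:", frameshifts)
print("in-frame indels:", in_frame)
```

If indel lengths were spread evenly across the three possible remainders, about one third of events would be in frame, which is one way to see why mosaics with retained gene function are common.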

It has recently been shown that several CRISPR events can occur in the same cell at the same time. The co-CRISPR or coconversion method, originally developed for C. elegans, has also been used successfully in Drosophila (Kane et al., 2017). The method is based on the simultaneous injection of nos-Cas9 embryos with a mixture of gRNAs to the gene of interest and to a selective marker, the ebony gene. Mutations in the studied gene are expected in any cell in which ebony is inactivated. Thus, offspring showing loss of ebony are selected for molecular analysis of the target gene (Kane et al., 2017).

The use of Cas9 nuclease fused to a fluorescent protein underlies the new CASFISH method (fluorescence in situ hybridization mediated by CRISPR-Cas9), which allows fluorescent labeling of target loci (Port et al., 2014). Cas9 can also be used to suppress transcription of a target gene (when it binds to the promoter region, regulatory regions, or the beginning of the coding region); in addition, to regulate transcription, a repressor or transcriptional activator may be attached to Cas9. The introduced protein tags can be not only regulators but also reporters, for example, fluorescent proteins (YFP, GFP, mCherry, etc.) or epitopes (FLAG, STREPII, Myc, etc.) (Thorn, 2017). Tagged proteins can be visualized in vivo using fluorescence microscopy or detected by immunohistochemistry, and epitopes can also be used in biochemical studies, for example, in purification of complexes of the target protein.

STUDYING EXPRESSION OF GENES IN SEPARATE CELLS

Sequencing of single-cell transcriptomes (single-cell RNA sequencing, scRNA-seq) is an extremely important approach in ontogenetic studies. It has already been used for a general analysis of early mammalian development. For the nematode C. elegans, a molecular atlas of embryonic development with single-cell resolution was compiled. Drosophila is no exception. One of the first studies performed using single-cell RNA sequencing was devoted to the mechanism of dosage compensation during early embryonic development (Lott et al., 2011).

No less promising are studies of the development of the central nervous system, including the brain. The Drosophila brain contains approximately 100 000 neurons, which arise from approximately 200 neuroblast precursors. The control of development can be represented as a network. To study the transcriptional networks underlying the development of various neuroblast lineages, Yang et al. (2016) marked and isolated neuroblasts specific to individual cell lineages and sequenced their transcriptomes. Specific neuroblasts were labeled using the GAL4/UAS system and monitored throughout neurogenesis.

The wing imaginal disk of Drosophila is an important model system for studying tissue growth, epithelial morphogenesis, intercellular signaling, cell competition, etc. The expression patterns of the vast majority of genes in the wing disk are not known. In order to obtain a complete atlas of gene expression in the wing disk, Bageritz et al. (2019) used sequencing of individual cells and developed a new method for analyzing scRNA-seq data based on correlations of gene expression.
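
The idea of relating genes through correlated expression across cells can be shown on a toy count table. This sketch does not reproduce the method of Bageritz et al. (2019); it only computes pairwise Pearson correlations, and the gene names and counts are invented for illustration:

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical expression counts: one list per gene, one value per cell.
counts = {
    "geneA": [0, 2, 4, 6],
    "geneB": [1, 3, 5, 7],
    "geneC": [6, 4, 2, 0],
}

correlations = {
    (g1, g2): pearson(counts[g1], counts[g2])
    for g1, g2 in combinations(sorted(counts), 2)
}
for pair, r in correlations.items():
    print(pair, round(r, 3))
```

Genes whose expression rises and falls together across cells (geneA and geneB here) are candidates for sharing an expression domain, whereas anticorrelated genes (geneA and geneC) are likely expressed in complementary regions.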

CONCLUSIONS

The century-old history of Drosophila in biology has been accompanied by the constant development of new methods of manipulating the genome and by the accumulation of collections of transgenic strains available to researchers, which now number more than 100 000. Drosophila genomic and genetic resources are being created and constantly updated, and bioinformatic approaches to genome analysis are being expanded. With the expansion of the methodological base, the possibilities of using Drosophila as a model for studying individual development processes are widening. Table 1 lists some examples of the use of Drosophila as a model using the above technologies in recent years. All this allows us to conclude that Drosophila will continue to be in demand in the near future as an object of research in developmental genetics.